# CSV File Handling


Practical work often involves importing and exporting data, and one of the most common storage formats is CSV (comma-separated values). In this notebook, we will learn how to read from and write to CSV files using the `CSV.jl` package in Julia.


In [None]:
# Run this cell to install the CSV and DataFrames packages
using Pkg
Pkg.add(["CSV", "DataFrames"])

Let's start by importing the necessary packages. We will use `CSV.jl` for handling CSV files and `DataFrames.jl` for working with tabular data.


In [2]:
using CSV, DataFrames

The examples below use a sample CSV file named `customers.csv` in the `assets` directory.

To read a CSV file into a DataFrame, we can use the `CSV.read` function, passing `DataFrame` as the second argument.


In [3]:
customers_data = CSV.read("./assets/customers.csv", DataFrame)

display(customers_data[1:5, :])  # Display the first 5 rows
# To access a specific column, you can use the column name
display(customers_data[1:5, "First Name"]) # Display the first 5 entries of the "First Name" column

| Row | Index | Customer Id | First Name | Last Name | Company | City | Country | Phone 1 | Phone 2 | Email | Subscription Date | Website |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Int64 | String15 | String15 | String15 | String31 | String31 | String | String31 | String31 | String | Date | String |
| 1 | 1 | DD37Cf93aecA6Dc | Sheryl | Baxter | Rasmussen Group | East Leonard | Chile | 229.077.5154 | 397.884.0519x718 | zunigavanessa@smith.info | 2020-08-24 | http://www.stephenson.com/ |
| 2 | 2 | 1Ef7b82A4CAAD10 | Preston | Lozano | Vega-Gentry | East Jimmychester | Djibouti | 5153435776 | 686-620-1820x944 | vmata@colon.com | 2021-04-23 | http://www.hobbs.com/ |
| 3 | 3 | 6F94879bDAfE5a6 | Roy | Berry | Murillo-Perry | Isabelborough | Antigua and Barbuda | +1-539-402-0259 | (496)978-3969x58947 | beckycarr@hogan.com | 2020-03-25 | http://www.lawrence.com/ |
| 4 | 4 | 5Cef8BFA16c5e3c | Linda | Olsen | Dominguez, Mcmillan and Donovan | Bensonview | Dominican Republic | 001-808-617-6467x12895 | +1-813-324-8756 | stanleyblackwell@benson.org | 2020-06-02 | http://www.good-lyons.com/ |
| 5 | 5 | 053d585Ab6b3159 | Joanna | Bender | Martin, Lang and Andrade | West Priscilla | Slovakia (Slovak Republic) | 001-234-203-0635x76146 | 001-199-446-3860x3486 | colinalvarado@miles.net | 2021-04-17 | https://goodwin-ingram.com/ |


5-element Vector{String15}:
 "Sheryl"
 "Preston"
 "Roy"
 "Linda"
 "Joanna"

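`CSV.read` also accepts keyword arguments that control parsing. The following is a minimal sketch rather than part of the original notebook: the inline data and the name `people` are hypothetical, and an `IOBuffer` stands in for a file on disk so the example is self-contained. It uses `missingstring` to declare which text should be parsed as `missing`, and `select` to keep only some columns.

```julia
using CSV, DataFrames

# Hypothetical inline data standing in for a CSV file on disk
csv_text = """
First Name,Last Name,Age
Alice,Smith,28
Bob,Johnson,NA
Carol,Lee,41
"""

# `missingstring` tells the parser to read "NA" as `missing`;
# `select` keeps only the listed columns
people = CSV.read(IOBuffer(csv_text), DataFrame;
                  missingstring="NA",
                  select=["First Name", "Age"])

display(people)
```

The same keyword arguments should work when the first argument is a file path such as `"./assets/customers.csv"`.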
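In DataFrames.jl indexing, the first position selects rows and the second selects columns. Both `:` and `!` in the row position select all rows: `df[:, col]` returns a copy of the column, while `df[!, col]` returns the stored column itself without copying. To select all columns, use `:` in the column position.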
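The difference between `!` and `:` in the row position matters once a column is modified. Here is a minimal sketch, using a small hypothetical DataFrame `df` rather than the customers file:

```julia
using DataFrames

# Hypothetical two-row DataFrame for illustration
df = DataFrame("Company" => ["Steele Group", "Beck-Hendrix"])

no_copy = df[!, "Company"]   # the vector stored inside df, no copy made
copied  = df[:, "Company"]   # an independent copy of the column

no_copy[1] = "Changed"       # modifies df as well
copied[2]  = "Ignored"       # df is unaffected

display(df)
```

Using `:` is the safer default when the result will be modified; `!` avoids the allocation when the column is only read.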


In [14]:
customers_data[!, :Company][1:10]  # Access the "Company" column using Symbol and display the first 10 entries

10-element Vector{String31}:
 "Rasmussen Group"
 "Vega-Gentry"
 "Murillo-Perry"
 "Dominguez, Mcmillan and Donovan"
 "Martin, Lang and Andrade"
 "Steele Group"
 "Lester, Woodard and Mitchell"
 "Sanford, Davenport and Giles"
 "Browning-Simon"
 "Beck-Hendrix"

To get a quick summary of the data in each column, we can use the `describe` function from the `DataFrames` package:


In [4]:
describe(customers_data)

| Row | variable | mean | min | median | max | nmissing | eltype |
|---|---|---|---|---|---|---|---|
| | Symbol | Union… | Any | Any | Any | Int64 | DataType |
| 1 | Index | 50.5 | 1 | 50.5 | 100 | 0 | Int64 |
| 2 | Customer Id | | 010468dAA11382c | | faCEF517ae7D8eB | 0 | String15 |
| 3 | First Name | | Aimee | | Yvonne | 0 | String15 |
| 4 | Last Name | | Alvarado | | Zuniga | 0 | String15 |
| 5 | Company | | Acosta, Petersen and Morrow | | Winters-Mendoza | 0 | String31 |
| 6 | City | | Acevedoville | | Zimmermanland | 0 | String31 |
| 7 | Country | | Albania | | Zimbabwe | 0 | String |
| 8 | Phone 1 | | (041)737-3846 | | 981-544-9452 | 0 | String31 |
| 9 | Phone 2 | | (026)401-7353x2417 | | 999-728-1637 | 0 | String31 |
| 10 | Email | | aharper@maddox-townsend.org | | zvalencia@phelps.com | 0 | String |
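
`describe` can also be limited to specific statistics by passing their names as symbols after the DataFrame. The sketch below uses a small hypothetical DataFrame (`df` and `stats` are illustrative names, not part of the original notebook) that contains one missing value:

```julia
using DataFrames

# Hypothetical data with one missing value in the "Age" column
df = DataFrame(
    "Age" => [28, 34, missing],
    "Country" => ["USA", "Canada", "Chile"]
)

# Request only the minimum, maximum, missing count, and element type
stats = describe(df, :min, :max, :nmissing, :eltype)

display(stats)
```

The result is itself a DataFrame with one row per column of `df`, so it can be indexed or filtered like any other DataFrame.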


To write a DataFrame to a CSV file, we can use the `CSV.write` function. Let's create a new DataFrame and write it to a CSV file named `new_customers.csv` in the `assets` directory.


First, let's create a new DataFrame with some sample data. The basic syntax for creating a DataFrame is as follows:

```julia
DataFrame(
    "Column1" => [value1, value2, ...],
    "Column2" => [value1, value2, ...],
    ...
)
```


In [5]:
new_customers = DataFrame(
    "First Name" => ["Alice", "Bob"],
    "Last Name" => ["Smith", "Johnson"],
    "Email" => ["alice_test@email.com", "bob_example@hotmail.com"],
    "Age" => [28, 34],
    "Country" => ["USA", "Canada"]
)

| Row | First Name | Last Name | Email | Age | Country |
|---|---|---|---|---|---|
| | String | String | String | Int64 | String |
| 1 | Alice | Smith | alice_test@email.com | 28 | USA |
| 2 | Bob | Johnson | bob_example@hotmail.com | 34 | Canada |


In [6]:
CSV.write("./assets/new_customers.csv", new_customers)

"./assets/new_customers.csv"
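
`CSV.write` returns the path of the file it wrote. One way to check the result is to read the file back and compare it with the original DataFrame. The sketch below is illustrative only: it writes to a temporary directory instead of `assets`, uses a shortened version of the sample data, and the names `path`, `semi_path`, and `round_trip` are hypothetical. It also shows the `delim` keyword for writing a file with a different delimiter.

```julia
using CSV, DataFrames

new_customers = DataFrame(
    "First Name" => ["Alice", "Bob"],
    "Age" => [28, 34]
)

# Write to a temporary directory so nothing is left behind in `assets`
dir = mktempdir()
path = joinpath(dir, "new_customers.csv")
CSV.write(path, new_customers)

# Read the file back and compare it with the original
round_trip = CSV.read(path, DataFrame)
println(round_trip == new_customers)

# Write and read a semicolon-delimited file using the `delim` keyword
semi_path = joinpath(dir, "new_customers_semicolon.csv")
CSV.write(semi_path, new_customers; delim=';')
semi_data = CSV.read(semi_path, DataFrame; delim=';')
```

The string column may come back with a compact string type (such as the `String15` seen earlier) rather than `String`, but the values still compare equal.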