## 9.12.2 Reading CSV Files into Pandas `DataFrames` 
* Here, we demonstrate pandas’ ability to load files in CSV format, then perform some basic data-analysis tasks

### Datasets
* Enormous variety of free datasets available online
* **Rdatasets repository** provides links to over 1100 free datasets in comma-separated values (CSV) format
> https://vincentarelbundock.github.io/Rdatasets/datasets.html
* **`pydataset` module** specifically for accessing Rdatasets
> https://github.com/iamaziz/PyDataset
* Another large source of datasets is
> https://github.com/awesomedata/awesome-public-datasets
* A commonly used machine-learning dataset for beginners is the **Titanic disaster dataset**

### Working with Locally Stored CSV Files 
* File we'll process in this example

In [3]:
!cat accounts.csv

100,Jones,24.98
200,Doe,345.67
300,White,0.0
400,Stone,-42.16
500,Rich,224.62


* Load a CSV dataset into a `DataFrame` with the pandas function **`read_csv`**
* `names` argument specifies the `DataFrame`’s column names
    * Without this argument, `read_csv` assumes that the CSV file’s first row is a comma-delimited list of column names

In [4]:
import pandas as pd

In [5]:
df = pd.read_csv('accounts.csv', 
                 names=['account', 'name', 'balance'])

In [6]:
df

| | account | name | balance |
|---|---|---|---|
| 0 | 100 | Jones | 24.98 |
| 1 | 200 | Doe | 345.67 |
| 2 | 300 | White | 0.0 |
| 3 | 400 | Stone | -42.16 |
| 4 | 500 | Rich | 224.62 |
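
The effect of the `names` argument can be checked with a small sketch. It assumes the same five records as `accounts.csv`, but holds them in an in-memory `io.StringIO` buffer (an illustrative stand-in for the file) so the code runs without any file on disk. The variable names `csv_text`, `with_names` and `without_names` are hypothetical.

```python
import io
import pandas as pd

# Same five records as accounts.csv, kept in memory so the sketch is self-contained
csv_text = """100,Jones,24.98
200,Doe,345.67
300,White,0.0
400,Stone,-42.16
500,Rich,224.62
"""

# With names, every line in the data is treated as a record
with_names = pd.read_csv(io.StringIO(csv_text),
                         names=['account', 'name', 'balance'])

# Without names, read_csv consumes the first record as the header row
without_names = pd.read_csv(io.StringIO(csv_text))

print(with_names.shape)             # (5, 3)
print(list(without_names.columns))  # ['100', 'Jones', '24.98']
print(without_names.shape)          # (4, 3)
```

Omitting `names` on a headerless file silently loses the first record to the header, which is why the example above passes the column names explicitly.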


* To save a `DataFrame` to a file using CSV format, call `DataFrame` method **`to_csv`**
* `index=False` indicates that the row names (`0`–`4` at the left of the `DataFrame`’s output above) are not written to the file
* Resulting file contains the column names as the first row

In [7]:
df.to_csv('accounts_from_dataframe.csv', index=False)

In [8]:
!cat accounts_from_dataframe.csv

account,name,balance
100,Jones,24.98
200,Doe,345.67
300,White,0.0
400,Stone,-42.16
500,Rich,224.62
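
To see what `index=False` changes, the following sketch compares the two forms of `to_csv` on a hypothetical two-row `DataFrame`. It relies on `to_csv` returning the CSV text as a string when no file path is given, so nothing is written to disk. The names `small_df`, `with_index`, `without_index` and `reloaded` are illustrative.

```python
import io
import pandas as pd

# A small, made-up DataFrame with the same columns as the accounts example
small_df = pd.DataFrame({'account': [100, 200],
                         'name': ['Jones', 'Doe'],
                         'balance': [24.98, 345.67]})

# With no path argument, to_csv returns the CSV text instead of writing a file
with_index = small_df.to_csv()
without_index = small_df.to_csv(index=False)

print(with_index.splitlines()[0])     # ,account,name,balance
print(without_index.splitlines()[0])  # account,name,balance

# Reading back the version that includes the index yields an extra, unnamed column
reloaded = pd.read_csv(io.StringIO(with_index))
print(list(reloaded.columns))  # ['Unnamed: 0', 'account', 'name', 'balance']
```

When the index is written, the header row starts with an empty field, and a later `read_csv` call labels that field `Unnamed: 0`. Passing `index=False` avoids the extra column when the row labels carry no meaning.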
