# Pandas
## Loading data
We'll start by loading data with Pandas, beginning with a simple CSV file. The first few lines of the file look like this:

```
sales_USD,year,genre
1561.2300000000005,2009,Rock
185.13,2009,Jazz
502.92000000000013,2009,Metal
466.29000000000025,2009,Alternative & Punk
```

We can load the data and look at the top and bottom of the file with `head` and `tail`:

In [17]:
import pandas as pd
df = pd.read_csv('chinook_data.csv')
df.head()

Unnamed: 0,sales_USD,year,genre
0,1561.23,2009,Rock
1,185.13,2009,Jazz
2,502.92,2009,Metal
3,466.29,2009,Alternative & Punk
4,13.86,2009,Rock And Roll


In [9]:
df.tail()

Unnamed: 0,sales_USD,year,genre
99,41.58,2013,World
100,6.93,2013,Hip Hop/Rap
101,25.86,2013,Science Fiction
102,157.15,2013,TV Shows
103,129.3,2013,Drama


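[`read_csv`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) has a lot of options, all described well in its documentation. One useful option is `parse_dates`, which converts the listed columns to datetimes. Note that `infer_datetime_format` is deprecated as of pandas 2.0, where format inference is the default, so on recent versions it can be dropped: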

In [11]:
df = pd.read_csv('chinook_data.csv', parse_dates=['year'], infer_datetime_format=True)
df.head()

Unnamed: 0,sales_USD,year,genre
0,1561.23,2009-01-01,Rock
1,185.13,2009-01-01,Jazz
2,502.92,2009-01-01,Metal
3,466.29,2009-01-01,Alternative & Punk
4,13.86,2009-01-01,Rock And Roll
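
When the date format is known in advance, it can be stated explicitly instead of inferred. The following is a minimal sketch of that alternative, not the approach used above: it reads an inline sample (three rows copied from the outputs in this section, standing in for `chinook_data.csv`), keeps `year` as text, and converts it with `pd.to_datetime` and the `%Y` format.

```python
import io

import pandas as pd

# Inline sample standing in for chinook_data.csv (rows taken from the outputs above).
csv_text = """sales_USD,year,genre
1561.23,2009,Rock
185.13,2009,Jazz
41.58,2013,World
"""

# Read the year column as text so the explicit format is applied to strings.
df = pd.read_csv(io.StringIO(csv_text), dtype={"year": str})

# Convert with an explicit format; a bare year maps to January 1 of that year.
df["year"] = pd.to_datetime(df["year"], format="%Y")

print(df.dtypes)
print(df.head())
```

Stating the format avoids any ambiguity in how the values are interpreted, at the cost of one extra line.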


The column on the left is the index. Since we did not specify an index column when loading the data, it was generated for us.

In [13]:
df.index

RangeIndex(start=0, stop=104, step=1)
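
If a file already contains a column that should serve as the index, `read_csv` can be told to use it through the `index_col` argument. The sketch below assumes an inline sample whose first column has an empty header, which is how a saved index typically appears in a CSV file; the sample text is illustrative and is not the actual contents of `chinook_data.csv`.

```python
import io

import pandas as pd

# Inline sample with an unnamed leading column (assumed layout, for illustration).
csv_text = """,sales_USD,year,genre
0,1561.23,2009,Rock
1,185.13,2009,Jazz
2,502.92,2009,Metal
"""

# Default: a RangeIndex is generated and the leading column is kept as data.
default_df = pd.read_csv(io.StringIO(csv_text))

# index_col=0: the first column becomes the index instead.
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(default_df.columns.tolist())
print(indexed_df.columns.tolist())
print(indexed_df.index.tolist())
```

With the default call, pandas names the header-less column `Unnamed: 0`; with `index_col=0` that column no longer appears among the data columns.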

We can write the data back to a CSV file with the new date format. Since the index was created automatically, we leave it out of the file by passing `index=False`.

In [14]:
df.to_csv('chinook_data_2.csv', index=False)

You could now try loading the file again, or looking at it in a text editor or Excel or OpenOffice.
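
The effect of `index=False` can also be checked without leaving Python. This is a small sketch using a two-row frame built by hand (values copied from the outputs above); calling `to_csv` without a path returns the CSV text as a string, which makes the header line easy to inspect.

```python
import io

import pandas as pd

# Two rows copied from the outputs above, built by hand for illustration.
df = pd.DataFrame(
    {"sales_USD": [1561.23, 185.13], "year": [2009, 2009], "genre": ["Rock", "Jazz"]}
)

# With no path argument, to_csv returns the CSV text instead of writing a file.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])     # header starts with an empty field for the index
print(without_index.splitlines()[0])  # header holds only the data columns

# Reading the index-free text back gives the original columns.
reloaded = pd.read_csv(io.StringIO(without_index))
print(reloaded.columns.tolist())
```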

Sometimes our CSV files use other separators, like a tab character (`\t`) or a pipe (`|`). We can specify the separator with the `sep` argument of `read_csv`:

In [20]:
df = pd.read_csv('chinook_data.tsv', sep='\t')
df.head()

Unnamed: 0,sales_USD,year,genre
0,1561.23,2009,Rock
1,185.13,2009,Jazz
2,502.92,2009,Metal
3,466.29,2009,Alternative & Punk
4,13.86,2009,Rock And Roll


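We can also set the `sep` argument to `None`, and pandas auto-detects the separator. Detection is done by Python's built-in `csv.Sniffer`, which requires the slower Python parsing engine; passing `engine='python'` as well avoids the warning about falling back to that engine: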

In [21]:
df = pd.read_csv('chinook_data.tsv', sep=None)
df.head()

Unnamed: 0,sales_USD,year,genre
0,1561.23,2009,Rock
1,185.13,2009,Jazz
2,502.92,2009,Metal
3,466.29,2009,Alternative & Punk
4,13.86,2009,Rock And Roll
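
To see what the detection step does, the standard library's `csv.Sniffer` can be called directly. The sketch below is only an illustration of the sniffer on its own, not of pandas internals; the two sample strings are made up from rows shown earlier, one tab-separated and one pipe-separated.

```python
import csv

# Illustrative samples: the same rows with two different separators.
tab_sample = "sales_USD\tyear\tgenre\n1561.23\t2009\tRock\n185.13\t2009\tJazz\n"
pipe_sample = "sales_USD|year|genre\n1561.23|2009|Rock\n185.13|2009|Jazz\n"

for sample in (tab_sample, pipe_sample):
    # sniff() inspects the text and returns a dialect describing its layout.
    dialect = csv.Sniffer().sniff(sample)
    print(repr(dialect.delimiter))
```

The sniffer works from a sample of the text, so an explicit `sep` remains the safer choice when the separator is known.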


We can also read a CSV file directly from a URL by passing the URL to `read_csv` in place of a file name.
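
As a minimal sketch of reading from a URL, the example below writes a small placeholder file to a temporary directory and loads it through a `file://` URL, so it runs without network access. The file name and contents are assumptions made for illustration; an `http://` or `https://` address would be passed to `read_csv` in the same way.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Placeholder file; name and contents are illustrative only.
    csv_path = Path(tmp_dir) / "sample.csv"
    csv_path.write_text("sales_USD,year,genre\n1561.23,2009,Rock\n185.13,2009,Jazz\n")

    # Turn the local path into a file:// URL and hand it to read_csv.
    url = csv_path.as_uri()
    df_from_url = pd.read_csv(url)

print(df_from_url)
```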