# File Management in Pandas

[Pandas](https://pandas.pydata.org/) is a Python library for data analysis and manipulation. It makes it easier to handle structured data from sources such as CSV files, Excel spreadsheets, JSON files, and SQL databases.

`pandas` is a third-party library, so it must be installed before use. To install it:

```
pip install pandas
```

To use `pandas` in a Python script, import it:

```
import pandas
```

By convention, `pandas` is usually imported as `pd`, so most examples (including these guides) use:

```
import pandas as pd
```


## Reading Files with Pandas

Pandas provides various functions to read data from different file formats into DataFrame objects (these will be covered later on). A DataFrame is a two-dimensional tabular data structure with labeled axes (rows and columns).
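As a minimal sketch of what "labeled axes" means, the snippet below builds a small DataFrame by hand and inspects its shape, column labels, and row labels. The sample values and column names are illustrative assumptions, not data taken from the files used later in this guide.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 4.7],
        "class": ["Iris-setosa", "Iris-setosa", "Iris-setosa"],
    }
)

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels
print(list(df.index))    # row labels (a default 0..n-1 range here)
```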

The most common sources that we will read from are:
- **CSV Files** - use `pd.read_csv(filepath)`
- **Excel Files** - use `pd.read_excel(io, sheet_name=0)`
- **JSON Files** - use `pd.read_json(filepath|buffer)`
- **SQL Database** - use `pd.read_sql(sql, con)`
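The readers listed above can be tried without any files on disk. The following is a rough sketch that assumes in-memory text buffers and a throwaway SQLite database. The `scores` table and its contents are made up for illustration. `pd.read_excel` is left out because it needs an extra library such as `openpyxl`.

```python
import io
import sqlite3

import pandas as pd

# CSV: read_csv accepts a file path or any file-like object.
csv_text = "name,score\nalice,90\nbob,85\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# JSON: a list of records becomes one row per record.
json_text = '[{"name": "alice", "score": 90}, {"name": "bob", "score": 85}]'
df_json = pd.read_json(io.StringIO(json_text))

# SQL: read_sql takes a query and an open connection.
# The "scores" table is a hypothetical example created just for this sketch.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
con.executemany("INSERT INTO scores VALUES (?, ?)", [("alice", 90), ("bob", 85)])
df_sql = pd.read_sql("SELECT name, score FROM scores", con)
con.close()

print(df_csv)
print(df_json)
print(df_sql)
```

All three calls return a DataFrame, so everything that follows works the same way regardless of where the data came from.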


## Writing Files with Pandas

Pandas can also create files by writing DataFrames back to disk:

- **CSV Files** - use `DataFrame.to_csv(filepath)`
- **Excel Files** - use `DataFrame.to_excel(excel_writer, sheet_name='Sheet1')`
- **JSON Files** - use `DataFrame.to_json(filepath|buffer)`
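To illustrate the writers listed above, here is a small sketch that writes a DataFrame to CSV and JSON in a temporary directory and reads both files back. The file names and sample data are assumptions made for this example. `to_excel` is omitted because it needs an extra library such as `openpyxl`.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data for the round trip.
df = pd.DataFrame({"name": ["alice", "bob"], "score": [90, 85]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "scores.csv"
    json_path = Path(tmp) / "scores.json"

    # index=False leaves the row labels out of the CSV file.
    df.to_csv(csv_path, index=False)
    # orient="records" writes one JSON object per row.
    df.to_json(json_path, orient="records")

    # Read both files back before the temporary directory is removed.
    csv_round_trip = pd.read_csv(csv_path)
    json_round_trip = pd.read_json(json_path)

print(csv_round_trip)
print(json_round_trip)
```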



## Examples

### Reading CSV Files

```
import pandas as pd

df = pd.read_csv("./assets/data.csv")
df.head()
```

|   | sepal_length | sepal_width | petal_length | petal_width | class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 6.1 | 3.0 | 4.9 | 1.8 | Iris-virginica |
| 4 | 6.4 | 2.8 | 5.6 | 2.1 | Iris-virginica |


### Writing to Excel

> Note: Reading and writing Excel files requires additional libraries: `openpyxl` for `.xlsx` files and `xlrd` for legacy `.xls` files. Install them with `pip install openpyxl xlrd`.

```
df.to_excel("./assets/data.xlsx", sheet_name="Iris", index=False)
```

### Reading from Excel

```
df_excel = pd.read_excel("./assets/data.xlsx", sheet_name="Iris")
df_excel.head()
```

|   | sepal_length | sepal_width | petal_length | petal_width | class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 6.1 | 3.0 | 4.9 | 1.8 | Iris-virginica |
| 4 | 6.4 | 2.8 | 5.6 | 2.1 | Iris-virginica |


### Reading a JSON File

```
df_json = pd.read_json("./assets/data.json")
df_json.head()
```

|   | id | name | username | email | address | phone | website | company |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Leanne Graham | Bret | Sincere@april.biz | {'street': 'Kulas Light', 'suite': 'Apt. 556',... | 1-770-736-8031 x56442 | hildegard.org | {'name': 'Romaguera-Crona', 'catchPhrase': 'Mu... |
| 1 | 2 | Ervin Howell | Antonette | Shanna@melissa.tv | {'street': 'Victor Plains', 'suite': 'Suite 87... | 010-692-6593 x09125 | anastasia.net | {'name': 'Deckow-Crist', 'catchPhrase': 'Proac... |
| 2 | 3 | Clementine Bauch | Samantha | Nathan@yesenia.net | {'street': 'Douglas Extension', 'suite': 'Suit... | 1-463-123-4447 | ramiro.info | {'name': 'Romaguera-Jacobson', 'catchPhrase': ... |
| 3 | 4 | Patricia Lebsack | Karianne | Julianne.OConner@kory.org | {'street': 'Hoeger Mall', 'suite': 'Apt. 692',... | 493-170-9623 x156 | kale.biz | {'name': 'Robel-Corkery', 'catchPhrase': 'Mult... |
| 4 | 5 | Chelsey Dietrich | Kamren | Lucio_Hettinger@annie.ca | {'street': 'Skiles Walks', 'suite': 'Suite 351... | (254)954-1289 | demarco.info | {'name': 'Keebler LLC', 'catchPhrase': 'User-c... |


### Writing to CSV File

```
df_json.to_csv("./assets/output.csv", index=False)
```
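The `index=False` argument used above keeps the DataFrame's row labels out of the file. As a small sketch with made-up data, the snippet below writes the same DataFrame with and without the index and compares the columns that come back when each result is read again.

```python
import io

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"name": ["alice", "bob"], "score": [90, 85]})

# With no path argument, to_csv returns the CSV text as a string.
with_index = df.to_csv()                # index written as an unnamed first column
without_index = df.to_csv(index=False)  # only the data columns are written

cols_with = list(pd.read_csv(io.StringIO(with_index)).columns)
cols_without = list(pd.read_csv(io.StringIO(without_index)).columns)

print(cols_with)     # ['Unnamed: 0', 'name', 'score']
print(cols_without)  # ['name', 'score']
```

Leaving the index in is useful when the row labels carry meaning, such as dates or IDs. For a default 0..n-1 index it usually only adds a redundant column.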