## 1.2 Series, DataFrames and CSVs

**Importing pandas into our notebook:**

In [1]:
import pandas as pd

**pandas has two main data structures:**
* Series
* DataFrame

In [2]:
# Series = 1-dimensional
series = pd.Series(["BMW", "Toyota", "Honda"])
series

0       BMW
1    Toyota
2     Honda
dtype: object

In [3]:
colours = pd.Series(["Red", "Blue", "White"])
colours

0      Red
1     Blue
2    White
dtype: object

In [5]:
# DataFrame = 2-dimensional (takes a dictionary)
car_data = pd.DataFrame({"Car make": series, "Colour": colours})
car_data

   Car make  Colour
0       BMW     Red
1    Toyota    Blue
2     Honda   White
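
As a small sketch of how the dictionary form works: each key becomes a column name and each value becomes that column. The variable names below are illustrative only, and the list-based variant is an assumption about how you might build the same table without creating Series first.

```python
import pandas as pd

# Illustrative data mirroring the cells above.
makes = pd.Series(["BMW", "Toyota", "Honda"])
colours = pd.Series(["Red", "Blue", "White"])

# Each dictionary key becomes a column name; each value becomes that column.
from_series = pd.DataFrame({"Car make": makes, "Colour": colours})

# Plain Python lists of equal length work as values too.
from_lists = pd.DataFrame({
    "Car make": ["BMW", "Toyota", "Honda"],
    "Colour": ["Red", "Blue", "White"],
})

print(from_series.shape)               # (number of rows, number of columns)
print(from_series.equals(from_lists))  # do both approaches give the same table?
```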


**We can also import data:**

In [6]:
# Step 1. Download or export the spreadsheet as a .csv file and move it into the project folder.
# Step 2. Import the .csv file into the Jupyter notebook:
car_sales = pd.read_csv("car-sales.csv")
car_sales

     Make  Colour  Odometer (KM)  Doors       Price
0  Toyota   White         150043      4   $4,000.00
1   Honda     Red          87899      4   $5,000.00
2  Toyota    Blue          32549      3   $7,000.00
3     BMW   Black          11179      5  $22,000.00
4  Nissan   White         213095      4   $3,500.00
5  Toyota   Green          99213      4   $4,500.00
6   Honda    Blue          45698      4   $7,500.00
7   Honda    Blue          54738      4   $7,000.00
8  Toyota   White          60000      4   $6,250.00
9  Nissan   White          31600      4   $9,700.00


**Anatomy of a DataFrame:**
* axis = 0 --> rows
* axis = 1 --> columns
* The unlabelled leftmost column in the output is the index (by default 0, 1, 2, ...), not part of the data
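
To make the axis convention concrete, here is a minimal sketch on a made-up numeric table (the column names `a` and `b` are hypothetical). Note that an aggregation along `axis=0` runs down the rows and so returns one value per column, while `axis=1` runs across the columns and returns one value per row.

```python
import pandas as pd

# Hypothetical numeric data, used only to illustrate the axis argument.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

col_totals = df.sum(axis=0)  # runs down the rows: one total per column
row_totals = df.sum(axis=1)  # runs across the columns: one total per row

# axis=1 tells drop() that "b" is a column label rather than an index label.
without_b = df.drop("b", axis=1)

print(col_totals)
print(row_totals)
print(without_b.columns.tolist())
```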

**Exporting DataFrames:**

In [7]:
# Export the DataFrame to a csv file
car_sales.to_csv("exported-car-sales.csv", index=False)

# Export the DataFrame to an Excel file (requires an Excel writer such as openpyxl to be installed)
# car_sales.to_excel("exported-car-sales.xlsx", index=False)

# index=False stops pandas from writing the index as an extra column in the exported file
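
The effect of `index=False` is easiest to see by exporting both ways and reading the result back. The sketch below uses a tiny made-up DataFrame and `io.StringIO` as a stand-in for a file on disk, so nothing is written to your project folder.

```python
import io
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({"Make": ["Toyota", "Honda"], "Doors": [4, 4]})

# With no path, to_csv() returns the CSV text instead of writing a file.
with_index = df.to_csv()                # default: the index is written as an unnamed first column
without_index = df.to_csv(index=False)  # only the data columns are written

# Read both versions back; StringIO lets read_csv treat a string like a file.
reloaded_with = pd.read_csv(io.StringIO(with_index))
reloaded_without = pd.read_csv(io.StringIO(without_index))

print(reloaded_with.columns.tolist())     # includes an extra "Unnamed: 0" column
print(reloaded_without.columns.tolist())  # just the original columns
```

If a file was already exported with the index, passing `index_col=0` to `read_csv()` is one way to read that first column back as the index instead of as data.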

In [8]:
exported_car_sales = pd.read_csv("exported-car-sales.csv")
exported_car_sales

     Make  Colour  Odometer (KM)  Doors       Price
0  Toyota   White         150043      4   $4,000.00
1   Honda     Red          87899      4   $5,000.00
2  Toyota    Blue          32549      3   $7,000.00
3     BMW   Black          11179      5  $22,000.00
4  Nissan   White         213095      4   $3,500.00
5  Toyota   Green          99213      4   $4,500.00
6   Honda    Blue          45698      4   $7,500.00
7   Honda    Blue          54738      4   $7,000.00
8  Toyota   White          60000      4   $6,250.00
9  Nissan   White          31600      4   $9,700.00


**How to import from a url:**

In some of the lectures, you'll notice .csv files being imported from a local file using something like:

heart_disease = pd.read_csv("data/heart-disease.csv")

This is helpful if you have the data downloaded to your computer or in the same directory as your notebook.

But if you don't, another great feature of pandas is being able to import .csv files directly from a URL.

For example, read_csv() can load the heart-disease.csv file directly from its URL in the course GitHub repo:

heart_disease = pd.read_csv("https://raw.githubusercontent.com/mrdbourke/zero-to-mastery-ml/master/data/heart-disease.csv")

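Note: If you're using a link from GitHub, make sure it's the "raw" version of the file, which you get by clicking the Raw button on the file's page. The normal page link returns an HTML page, not the .csv contents.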
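
As a rough sketch of what the Raw button does to the link: the raw address swaps the `github.com` host for `raw.githubusercontent.com` and drops the `/blob/` segment. The helper name `to_raw_github_url` is hypothetical (it is not part of pandas or the course code), and the page URL below is assumed from GitHub's usual layout for file pages.

```python
def to_raw_github_url(url: str) -> str:
    """Hypothetical helper: turn a github.com file-page URL into its raw equivalent."""
    prefix = "https://github.com/"
    if not url.startswith(prefix) or "/blob/" not in url:
        return url  # already a raw link, or not a GitHub file page
    path = url[len(prefix):].replace("/blob/", "/", 1)
    return "https://raw.githubusercontent.com/" + path


# Assumed file-page URL for the course data file.
page_url = "https://github.com/mrdbourke/zero-to-mastery-ml/blob/master/data/heart-disease.csv"
raw_url = to_raw_github_url(page_url)
print(raw_url)

# pd.read_csv(raw_url) would then download and parse the file (needs network access).
```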

In [13]:
# From course github repo (https://github.com/mrdbourke/zero-to-mastery-ml) > data > heart-disease.csv > Raw > copy url and paste in code below.
heart_disease = pd.read_csv('https://raw.githubusercontent.com/mrdbourke/zero-to-mastery-ml/master/data/heart-disease.csv')
heart_disease.head()

   age  sex  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  slope  ca  thal  target
0   63    1   3       145   233    1        0      150      0      2.3      0   0     1       1
1   37    1   2       130   250    0        1      187      0      3.5      0   0     2       1
2   41    0   1       130   204    0        0      172      0      1.4      2   0     2       1
3   56    1   1       120   236    0        1      178      0      0.8      2   0     2       1
4   57    0   0       120   354    0        1      163      1      0.6      2   0     2       1


In [17]:
# We can export the dataframe to a csv file using the to_csv() method.
heart_disease.to_csv('exported-heart-disease.csv', index=False)

In [19]:
# Let's check the exported file
exported_heart_disease = pd.read_csv('exported-heart-disease.csv')
exported_heart_disease.head()

   age  sex  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  slope  ca  thal  target
0   63    1   3       145   233    1        0      150      0      2.3      0   0     1       1
1   37    1   2       130   250    0        1      187      0      3.5      0   0     2       1
2   41    0   1       130   204    0        0      172      0      1.4      2   0     2       1
3   56    1   1       120   236    0        1      178      0      0.8      2   0     2       1
4   57    0   0       120   354    0        1      163      1      0.6      2   0     2       1
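
Comparing the two outputs by eye works for five rows, but a programmatic check scales better. The sketch below is one possible way to do it, using a three-column subset of the rows shown above, a temporary directory instead of the project folder, and `pd.testing.assert_frame_equal`, which raises an error if the two DataFrames differ.

```python
import os
import tempfile

import pandas as pd

# A few columns from the first rows shown above, typed in by hand for illustration.
original = pd.DataFrame({
    "age": [63, 37, 41],
    "chol": [233, 250, 204],
    "oldpeak": [2.3, 3.5, 1.4],
})

# Write to a temporary folder so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "exported-sample.csv")
    original.to_csv(path, index=False)
    reloaded = pd.read_csv(path)

# Raises AssertionError if values, column names, or dtypes do not match.
pd.testing.assert_frame_equal(original, reloaded)
print("Round trip preserved the data")
```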
