![Pandas Logo](https://www.kindpng.com/picc/m/574-5747046_python-pandas-logo-transparent-hd-png-download.png)

<h3 style="background-color:blue; font-family: Monaco" >Creating, Reading and Writing</h3>

In [1]:
import pandas as pd

<h3 style="color:red; font-family: Monaco">Basic DataFrame</h3>

In [2]:
'''
Creating data
There are two core objects in pandas: the DataFrame and the Series.
1. A DataFrame is a table. It contains an array of individual entries,
   each of which sits at a particular row and column.
'''
pd.DataFrame({'Yes':[50, 21], 'No': [131, 2]})


Unnamed: 0,Yes,No
0,50,131
1,21,2


In [3]:
# A DataFrame can also contain string values
pd.DataFrame({'Name': ['TuyenDT', 'DTTuyen'], 'School':['FPT', 'NBK']})

Unnamed: 0,Name,School
0,TuyenDT,FPT
1,DTTuyen,NBK


* We use `pd.DataFrame()` to generate these DataFrame objects. The syntax for declaring a new one is a dictionary whose **keys** are the **column** names and whose values are lists of entries.
* The dictionary-of-lists constructor assigns values to the column labels, but uses an ascending count from 0 for the row labels. We can change this by specifying the **index** parameter.

In [4]:
pd.DataFrame(
    {'Name Product': ['Chum Bat Trang', 'Chum Dung Gao'], 'Price': [12000, 20000]},
    index=['Product1', 'Product2'],
)

Unnamed: 0,Name Product,Price
Product1,Chum Bat Trang,12000
Product2,Chum Dung Gao,20000
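The effect of the `index` parameter can be checked directly. The following is a minimal sketch that reuses the product values from the cell above; the variable names `products`, `default_products` and `price` are chosen here for illustration only.

```python
import pandas as pd

data = {'Name Product': ['Chum Bat Trang', 'Chum Dung Gao'], 'Price': [12000, 20000]}

# Without `index=`, pandas labels the rows 0, 1, 2, ...
default_products = pd.DataFrame(data)

# With `index=`, the rows carry the labels we supply.
products = pd.DataFrame(data, index=['Product1', 'Product2'])

print(list(default_products.index))  # [0, 1]
print(list(products.index))          # ['Product1', 'Product2']

# Custom labels can then be used to look up a single value with .loc
price = products.loc['Product2', 'Price']
print(price)  # 20000
```

Meaningful row labels like these make lookups easier to read than positional numbers.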


<h3 style="color:red; font-family: Monaco">Basic Series</h3>

A `Series` is a **sequence of data values**. If a DataFrame is a table, a Series is a **list**.

In [6]:
pd.Series([1,2,3,4,5])

0    1
1    2
2    3
3    4
4    5
dtype: int64

In [7]:
# You can also specify an index
pd.Series([30, 35, 40], index=[2015, 2016, 2017])

2015    30
2016    35
2017    40
dtype: int64
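One way to see how the two objects relate is that a single column taken from a DataFrame is itself a Series. The sketch below assumes a made-up column name, `'Product A'`, and reuses the values from the cell above.

```python
import pandas as pd

# A Series has no column names, but it can carry one overall name.
sales = pd.Series([30, 35, 40], index=[2015, 2016, 2017], name='Product A')

# The same values stored as the only column of a DataFrame.
table = pd.DataFrame({'Product A': [30, 35, 40]}, index=[2015, 2016, 2017])

# Selecting one column of a DataFrame gives back a Series.
column = table['Product A']
print(type(column).__name__)  # Series
print(column.name)            # Product A
print(column.equals(sales))   # True
```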

<h3 style="color: red; font-family: Monaco">Reading Data Files </h3>

* Most of the time, we will be working with data that already exists.
* The most basic of these data formats is the CSV file.

In [8]:
#We use `pd.read_csv()` to read the data into a DataFrame
wine_review = pd.read_csv('winemag-data.csv')

In [9]:
# We can use the shape attribute to check how large the resulting DataFrame is
wine_review.shape
# The result tells us the dataset has nearly 130k records and 14 columns


(129971, 14)
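To try `pd.read_csv()` and `shape` without the wine file on disk, the same calls can be run on a small in-memory CSV. This is only a sketch: `csv_text` is a three-row sample built from a few values in the dataset, and `io.StringIO` stands in for the file path.

```python
import io

import pandas as pd

# A tiny stand-in for a CSV file; the first row has no price.
csv_text = """country,points,price
Italy,87,
Portugal,87,15.0
US,87,14.0
"""

sample = pd.read_csv(io.StringIO(csv_text))

# shape is a (rows, columns) tuple
n_rows, n_cols = sample.shape
print(n_rows, n_cols)  # 3 3

# Empty fields are read as missing values (NaN)
missing_prices = int(sample['price'].isna().sum())
print(missing_prices)  # 1
```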

In [10]:
# Use head() to examine the first five rows of the result
wine_review.head()

Unnamed: 0.1,Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
0,0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm
3,3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian
4,4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks


As we see above, the CSV file has a built-in index column, but pandas does not pick it up automatically. To make pandas use that column as the index, we can specify `index_col`.

In [11]:
wine_review = pd.read_csv('winemag-data.csv', index_col=0)
wine_review.head()

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm
3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian
4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks
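The difference `index_col` makes, and the writing step that mirrors reading, can be sketched on a small in-memory CSV. The sample text and variable names below are assumptions for illustration; `DataFrame.to_csv()` is the pandas counterpart of `pd.read_csv()`, and a `StringIO` buffer is used here in place of a real file.

```python
import io

import pandas as pd

# The header starts with an empty field, as in a file saved with its index.
csv_text = """,country,points
0,Italy,87
1,Portugal,87
"""

without_index = pd.read_csv(io.StringIO(csv_text))
with_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(without_index.columns))  # ['Unnamed: 0', 'country', 'points']
print(list(with_index.columns))     # ['country', 'points']

# Writing: to_csv() saves the DataFrame, index included, as CSV text.
buffer = io.StringIO()
with_index.to_csv(buffer)

# Reading it back with index_col=0 restores the same table.
round_trip = pd.read_csv(io.StringIO(buffer.getvalue()), index_col=0)
print(round_trip.equals(with_index))  # True
```

Passing a file name such as `'output.csv'` (a hypothetical path) to `to_csv()` instead of the buffer would write the data to disk.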
