Import
-

In [1]:
# Import the pandas library
import pandas as pd

Creating Series and DataFrames
-

In [2]:
#Series - https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html
pd.Series([30, 35, 40], index=['2015 Sales', '2016 Sales', '2017 Sales'])

2015 Sales    30
2016 Sales    35
2017 Sales    40
dtype: int64

In [3]:
#Dataframes - https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html 
pd.DataFrame({'Bob': ['I liked it.', 'It was awful.'], 
              'Sue': ['Pretty good.', 'Bland.']},
             index=['Product A', 'Product B'])

Unnamed: 0,Bob,Sue
Product A,I liked it.,Pretty good.
Product B,It was awful.,Bland.


Import Data, check data and export data
-

In [4]:
#From local CSV
my_local_data = pd.read_csv("file://localhost/users/nicki/desktop/udvikling/python/data/winemag-data-130k-v2.csv", index_col=0)

#From URL
my_web_data=pd.read_csv("https://apps.who.int/gho/athena/data/GHO/NCD_BMI_30A?filter=AGEGROUP:*;COUNTRY:*;SEX:*&x-sideaxis=COUNTRY&x-topaxis=YEAR;GHO;AGEGROUP;SEX&profile=crosstable&format=csv")


In [5]:
#quick overview
print(my_web_data.shape)
my_web_data.head()
my_web_data.describe()

(198, 127)


Unnamed: 0.1,Unnamed: 0,2016,2016.1,2016.2,2015,2015.1,2015.2,2014,2014.1,2014.2,...,1978.2,1977,1977.1,1977.2,1976,1976.1,1976.2,1975,1975.1,1975.2
count,196,198,198,198,198,198,198,198,198,198,...,198,198,198,198,198,198,198,198,198,198
unique,196,192,195,195,194,194,195,194,193,194,...,192,184,164,185,183,161,190,177,155,185
top,Kuwait,No data,No data,No data,No data,No data,No data,No data,No data,No data,...,No data,No data,0.2 [0.0-0.7],No data,No data,0.2 [0.0-0.7],No data,0.5 [0.2-1.1],0.4 [0.1-1.2],No data
freq,1,4,4,4,4,4,4,4,4,4,...,4,4,6,4,4,7,4,4,7,4


In [6]:
# Choosing a column as the index:
#some_df.set_index("column_name")

# Renaming the row index and the column index:
#some_df.index.names = ['index_title']
#some_df.columns.names = ['columns_title']
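
The commented-out template above can be tried on a small, made-up DataFrame. This is only a sketch: `some_df` and its column names are illustrative assumptions and are not part of the wine data.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
some_df = pd.DataFrame({"product": ["A", "B", "C"],
                        "sales": [30, 35, 40]})

# set_index() returns a new DataFrame; some_df itself is left unchanged
indexed_df = some_df.set_index("product")

# Give the row index and the column index their own names
indexed_df.index.names = ["index_title"]
indexed_df.columns.names = ["columns_title"]

print(indexed_df)
```

Because set_index() does not modify the original, the result has to be assigned to a name to be kept.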

In [7]:
# read_csv() already returns a DataFrame, so this conversion is optional
my_local_df=pd.DataFrame(my_local_data)

In [8]:
#Saving data locally in working dir
my_web_data.to_csv("data_from_web.csv")
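
A minimal sketch of a CSV round trip, which may help explain the "Unnamed: 0" columns seen in the overview above. The toy data is an assumption, and an in-memory buffer stands in for a file on disk.

```python
import io
import pandas as pd

# Hypothetical toy data, used only for illustration
toy_df = pd.DataFrame({"country": ["Italy", "France"], "points": [87, 90]})

# Write to an in-memory text buffer instead of a file
buffer = io.StringIO()
toy_df.to_csv(buffer)  # the index is written as an unnamed first column
csv_text = buffer.getvalue()

# Without index_col, the saved index comes back as a column called "Unnamed: 0"
plain = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column is used as the index again
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(plain.columns))
print(list(restored.columns))
```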

Simple Data Access
-

In [9]:
#Access single values.
print(my_local_df["country"][1])
print(my_local_df.country[2])

# Accessing all values in a column (attribute or bracket syntax)
print(my_local_df.country)
print(my_local_df["points"])

Portugal
US
0            Italy
1         Portugal
2               US
3               US
4               US
            ...   
129966     Germany
129967          US
129968      France
129969      France
129970      France
Name: country, Length: 129971, dtype: object
0         87
1         87
2         87
3         87
4         87
          ..
129966    90
129967    90
129968    90
129969    90
129970    90
Name: points, Length: 129971, dtype: int64


Index-based selection iloc
-

In [10]:
# Both loc and iloc are row-first, column-second.
#When we use iloc we treat the dataset like a big matrix (a list of lists), one that we have to index into by position.
# To select the first row of data in a DataFrame, we may use the following:
print(my_local_df.iloc[0])

country                                                              Italy
description              Aromas include tropical fruit, broom, brimston...
designation                                                   Vulkà Bianco
points                                                                  87
price                                                                  NaN
province                                                 Sicily & Sardinia
region_1                                                              Etna
region_2                                                               NaN
taster_name                                                  Kerin O’Keefe
taster_twitter_handle                                         @kerinokeefe
title                                    Nicosia 2013 Vulkà Bianco  (Etna)
variety                                                        White Blend
winery                                                             Nicosia
Name: 0, dtype: object


In [11]:
# Retrieving columns is marginally harder. To get a column with iloc, we can do the following:
print(my_local_df.iloc[:, 0])

0            Italy
1         Portugal
2               US
3               US
4               US
            ...   
129966     Germany
129967          US
129968      France
129969      France
129970      France
Name: country, Length: 129971, dtype: object


In [12]:
#It's also possible to pass a list:
my_local_df.iloc[[0, 1, 10], 0]

0        Italy
1     Portugal
10          US
Name: country, dtype: object

Label-based selection loc
-

In [13]:
# loc operator: label-based selection. In this paradigm, it's the data index value, not its position, which matters.
# Get the country of the first entry in my_local_df
print(my_local_df.loc[0, 'country'])


Italy


In [14]:
#loc, by contrast, uses the information in the indices to do its work. Since your dataset usually has meaningful indices, it's usually easier to do things using loc instead. For example, here's one operation that's much easier using loc:
my_local_df.loc[0:2, ['taster_name', 'taster_twitter_handle', 'points']]

Unnamed: 0,taster_name,taster_twitter_handle,points
0,Kerin O’Keefe,@kerinokeefe,87
1,Roger Voss,@vossroger,87
2,Paul Gregutt,@paulgwine,87


In [15]:
# Choosing between loc and iloc

# When choosing or transitioning between loc and iloc, there is one "gotcha" worth keeping in mind, which is that the two methods use slightly different indexing schemes.

# iloc uses the Python stdlib indexing scheme, where the first element of the range is included and the last one excluded. So 0:10 will select entries 0,...,9. loc, meanwhile, indexes inclusively. So 0:10 will select entries 0,...,10.


# This is particularly confusing when the DataFrame index is a simple numerical list, e.g. 0,...,1000. In this case df.iloc[0:1000] will return 1000 entries, while df.loc[0:1000] returns 1001 of them! To get 1000 elements using loc, you will need to go one lower and ask for df.loc[0:999].

# Otherwise, the semantics of using loc are the same as those for iloc.
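
The slicing difference described above can be checked on a small Series. This is a sketch with made-up data; the names `s` and `t` are assumptions.

```python
import pandas as pd

# Hypothetical toy Series with the default integer index 0..9
s = pd.Series(range(10))

by_position = s.iloc[0:3]  # end excluded: positions 0, 1, 2
by_label = s.loc[0:3]      # end included: labels 0, 1, 2, 3

print(len(by_position), len(by_label))

# With string labels, loc still slices by label and includes the end label
t = pd.Series([30, 35, 40], index=["a", "b", "c"])
print(t.loc["a":"b"])
```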

Conditional Selection
-

In [16]:
# Choosing all rows that satisfy a condition
my_local_df.loc[my_local_df.country == 'Italy']

# To combine conditions with OR, use a pipe (|):
my_local_df.loc[(my_local_df.country == 'Italy') | (my_local_df.points >= 90)]

# To require several conditions at once, use AND (&):
my_local_df.loc[(my_local_df.country == 'Italy') & (my_local_df.points >= 100)].head(2)

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
7335,Italy,Thick as molasses and dark as caramelized brow...,Occhio di Pernice,100,210.0,Tuscany,Vin Santo di Montepulciano,,,,Avignonesi 1995 Occhio di Pernice (Vin Santo ...,Prugnolo Gentile,Avignonesi
39286,Italy,"A perfect wine from a classic vintage, the 200...",Masseto,100,460.0,Tuscany,Toscana,,,,Tenuta dell'Ornellaia 2007 Masseto Merlot (Tos...,Merlot,Tenuta dell'Ornellaia


In [17]:
# isin lets you select data whose value "is in" a list of values. For example, here's how we can use it to select wines only from Italy or France:
my_local_df.loc[my_local_df.country.isin(['Italy', 'France']) & (my_local_df.points >= 100)].head()

# isnull (and its companion notnull) select rows whose values are (or are not) missing (NaN):
my_local_df.loc[my_local_df.designation.notnull() & my_local_df.country.isin(['Italy', 'France']) & (my_local_df.points >= 100)].head(2)

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
7335,Italy,Thick as molasses and dark as caramelized brow...,Occhio di Pernice,100,210.0,Tuscany,Vin Santo di Montepulciano,,,,Avignonesi 1995 Occhio di Pernice (Vin Santo ...,Prugnolo Gentile,Avignonesi
36528,France,This is a fabulous wine from the greatest Cham...,Brut,100,259.0,Champagne,Champagne,,Roger Voss,@vossroger,Krug 2002 Brut (Champagne),Champagne Blend,Krug
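
A minimal sketch of what happens behind these selections, on a made-up DataFrame (`toy` and its values are assumptions). Each comparison produces a boolean Series, and loc keeps the rows where it is True.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    "country": ["Italy", "France", "US", "Italy"],
    "points": [100, 95, 100, 88],
    "designation": ["Masseto", None, "Reserve", "Classico"],
})

# A comparison yields a boolean mask with one value per row
is_italy = toy.country == "Italy"
print(is_italy.tolist())

# Comparisons need their own parentheses because & and | bind tighter than == and >=
top_european = toy.loc[toy.country.isin(["Italy", "France"]) & (toy.points >= 95)]

# notnull() drops the rows where designation is missing
named = toy.loc[toy.designation.notnull() & (toy.points >= 95)]

print(top_european)
print(named)
```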


Assigning Data
-

In [18]:
# Going the other way, assigning data to a DataFrame is easy. You can assign either a constant value:
my_local_df['critic'] = 'everyone'
print(my_local_df['critic'].head())

#Or with an iterable of values:
my_local_df['index_backwards'] = range(len(my_local_df), 0, -1)
print(my_local_df['index_backwards'].head())

0    everyone
1    everyone
2    everyone
3    everyone
4    everyone
Name: critic, dtype: object
0    129971
1    129970
2    129969
3    129968
4    129967
Name: index_backwards, dtype: int32


Summary Functions
-

In [19]:
# Type-aware high-level summary of the attributes of the given column
print(my_local_df.points.describe())
print(my_local_df.country.describe())

count    129971.000000
mean         88.447138
std           3.039730
min          80.000000
25%          86.000000
50%          88.000000
75%          91.000000
max         100.000000
Name: points, dtype: float64
count     129908
unique        43
top           US
freq       54504
Name: country, dtype: object


In [20]:
# The mean of a numeric column:
my_local_df.points.mean()

88.44713820775404

In [21]:
#List of unique values:
my_local_df.taster_name.unique()

array(['Kerin O’Keefe', 'Roger Voss', 'Paul Gregutt',
       'Alexander Peartree', 'Michael Schachner', 'Anna Lee C. Iijima',
       'Virginie Boone', 'Matt Kettmann', nan, 'Sean P. Sullivan',
       'Jim Gordon', 'Joe Czerwinski', 'Anne Krebiehl\xa0MW',
       'Lauren Buzzeo', 'Mike DeSimone', 'Jeff Jenssen',
       'Susan Kostrzewa', 'Carrie Dykes', 'Fiona Adams',
       'Christina Pickard'], dtype=object)

In [22]:
#List of unique values and how often they occur in the dataset:
my_local_df.taster_name.value_counts().head()

Roger Voss           25514
Michael Schachner    15134
Kerin O’Keefe        10776
Virginie Boone        9537
Paul Gregutt          9532
Name: taster_name, dtype: int64

Changing all data with map(), apply() and built-ins
-

In [23]:
# A map is a function that takes one set of values and "maps" them to another set of values. There are two mapping methods that you will use often: map() and apply(). To re-mean the values with map():
my_local_mean = my_local_df.points.mean()
print(my_local_df.points.map(lambda x: x - my_local_mean).head())

# Count how many descriptions contain a given word
print(my_local_df.description.map(lambda x: "tropical" in x).sum())

#The function you pass to map() should expect a single value from the Series (a point value, in the above example), and return a transformed version of that value. map() returns a new Series where all the values have been transformed by your function.

0   -1.447138
1   -1.447138
2   -1.447138
3   -1.447138
4   -1.447138
Name: points, dtype: float64
3607
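
A small sketch of the same two map() calls on made-up Series (`points` and `descriptions` are assumptions), showing that map() returns a new Series and leaves the original untouched.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
points = pd.Series([80, 90, 100], name="points")
descriptions = pd.Series(["tropical fruit", "dark cherry", "tropical and citrus"])

# Re-mean the values: each element is passed to the lambda one at a time
mean = points.mean()
centered = points.map(lambda x: x - mean)

# Count the descriptions that contain a given word
tropical_count = descriptions.map(lambda x: "tropical" in x).sum()

print(centered.tolist())
print(points.tolist())  # the original Series is unchanged
print(tropical_count)
```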


In [24]:
# apply() is the equivalent method if we want to transform a whole DataFrame by calling a custom method on each row.
my_local_mean = my_local_df.points.mean()
def remean_points(row):
    row.points = row.points - my_local_mean
    return row
#my_local_df.apply(remean_points, axis='columns')


# Writing a function to use with apply()
def stars(row):
    if row.country == 'Canada':
        return 3
    elif row.points >= 95:
        return 3
    elif row.points >= 85:
        return 2
    else:
        return 1
#my_local_df.apply(stars, axis='columns')
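
To see the stars() function in action without the full wine data, here is a sketch on a made-up four-row DataFrame (`toy` is an assumption). The function is repeated so the example is self-contained.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({"country": ["Canada", "Italy", "US", "France"],
                    "points": [82, 96, 88, 80]})

def stars(row):
    # Each row arrives as a Series, so columns are available as attributes
    if row.country == 'Canada':
        return 3
    elif row.points >= 95:
        return 3
    elif row.points >= 85:
        return 2
    else:
        return 1

# axis='columns' calls the function once per row
star_ratings = toy.apply(stars, axis='columns')
print(star_ratings.tolist())
```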

In [25]:
# Pandas Built-ins
my_local_df.points - 10

# Combining strings
my_local_df.country + " - " + my_local_df.region_1

# These operators are faster than map() or apply() because they use speed-ups built into pandas. All of the standard Python operators (>, <, ==, and so on) work in this manner. However, they are not as flexible as map() or apply(), which can do more advanced things, like applying conditional logic, which cannot be done with addition and subtraction alone.

# Finding the wine that is the best bargain (highest points-to-price ratio)
points_to_price = my_local_df.points / my_local_df.price
my_local_df.title[points_to_price.idxmax()]

'Bandit NV Merlot (California)'
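
A minimal sketch comparing the built-in operator with map(), and repeating the bargain lookup on made-up data (`toy` is an assumption). One price is deliberately missing to show how the ratio behaves.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({"title": ["Wine A", "Wine B", "Wine C"],
                    "points": [88, 90, 85],
                    "price": [22.0, None, 10.0]})

# The built-in operator and map() give the same values
vectorized = toy.points - 10
mapped = toy.points.map(lambda p: p - 10)
print(vectorized.equals(mapped))

# Division is element-wise; a missing price gives a NaN ratio
ratio = toy.points / toy.price

# idxmax() skips NaN by default and returns the index label of the largest ratio
best_bargain = toy.title[ratio.idxmax()]
print(best_bargain)
```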

Groupwise analysis - slicing based on values in a column
-

In [26]:
# Count observations with a given attribute value
my_local_df.groupby('points').size().head()

# We can use any of the summary functions 
#.max()
#.min()
#.mean()

points
80     397
81     692
82    1836
83    3025
84    6480
dtype: int64
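
The summary functions listed in the comments can be applied to any column of the grouped data. A sketch on made-up data (`toy` is an assumption):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({"points": [80, 80, 81, 81, 81],
                    "price": [10.0, 14.0, 20.0, 30.0, 40.0]})

# Number of rows per points value
counts = toy.groupby("points").size()

# Cheapest and average price per points value
min_price = toy.groupby("points").price.min()
mean_price = toy.groupby("points").price.mean()

print(counts)
print(min_price)
print(mean_price)
```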

In [27]:
# For example, here's one way of selecting the name of the first wine reviewed from each winery in the dataset.
# The lambda must index its own argument (df, the rows of one winery); indexing my_local_df instead would return the first title of the whole dataset for every group:
my_local_df.groupby('winery').apply(lambda df: df.title.iloc[0]).head()

In [28]:
#For even more fine-grained control, you can also group by more than one column. For an example, here's how we would pick out the best wine by country and province (the lambda indexes its own argument df, the rows of one group):
my_local_df.groupby(['country', 'province']).apply(lambda df: df.loc[df.points.idxmax()])

# Grouping by more than one column produces a result with a MultiIndex:
my_local_df.groupby(['country', 'province']).size()


country    province        
Argentina  Mendoza Province    3264
           Other                536
Armenia    Armenia                2
Australia  Australia Other      245
           New South Wales       85
                               ... 
Uruguay    Juanico               12
           Montevideo            11
           Progreso              11
           San Jose               3
           Uruguay               24
Length: 425, dtype: int64
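
A sketch of groupby().apply() on made-up data (`toy` is an assumption). The function receives the rows of one group at a time, so it should work on its argument rather than on the full DataFrame. Selecting the needed columns before apply() is a choice made here so that the function only sees non-grouping columns.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    "country": ["Italy", "Italy", "France", "France"],
    "province": ["Tuscany", "Tuscany", "Champagne", "Bordeaux"],
    "title": ["T1", "T2", "C1", "B1"],
    "points": [90, 100, 95, 92],
})

# First title per country: df holds only the rows of the current group
first_title = toy.groupby("country")[["title"]].apply(lambda df: df.title.iloc[0])

# Best wine per (country, province): the result is indexed by both keys
best = (toy.groupby(["country", "province"])[["title", "points"]]
           .apply(lambda df: df.loc[df.points.idxmax()]))

# Group sizes, also with a two-level index
sizes = toy.groupby(["country", "province"]).size()

print(first_title)
print(best)
print(sizes)
```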

In [29]:
#Another groupby() method worth mentioning is agg(), which lets you run a bunch of different functions on your DataFrame simultaneously.
stat_df=my_local_df.groupby(['country']).price.agg([len, min, max]).head()
stat_df

Unnamed: 0_level_0,len,min,max
country,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Argentina,3800.0,4.0,230.0
Armenia,2.0,14.0,15.0
Australia,2329.0,5.0,850.0
Austria,3345.0,7.0,1100.0
Bosnia and Herzegovina,2.0,12.0,13.0
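
As an alternative to passing the Python built-ins, agg() also accepts the names of pandas aggregations as strings. A sketch on made-up data (`toy` is an assumption), which also shows the difference between counting all rows and counting only non-missing values:

```python
import pandas as pd

# Hypothetical toy data with one missing price, used only for illustration
toy = pd.DataFrame({"country": ["A", "A", "A", "B"],
                    "price": [4.0, None, 230.0, 14.0]})

# "size" counts all rows in the group (like len); "count" ignores missing values
stats = toy.groupby("country").price.agg(["size", "count", "min", "max"])
print(stats)
```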


Sorting & Reindexing
-

In [30]:
#New index
stat_df = stat_df.reset_index()
print(stat_df)

# Sorting by a column other than the index
print(stat_df.sort_values(by='len',ascending=False))

# To sort by index values, use the companion method sort_index(). 
print(stat_df.sort_index())


                  country     len   min     max
0               Argentina  3800.0   4.0   230.0
1                 Armenia     2.0  14.0    15.0
2               Australia  2329.0   5.0   850.0
3                 Austria  3345.0   7.0  1100.0
4  Bosnia and Herzegovina     2.0  12.0    13.0
                  country     len   min     max
0               Argentina  3800.0   4.0   230.0
3                 Austria  3345.0   7.0  1100.0
2               Australia  2329.0   5.0   850.0
1                 Armenia     2.0  14.0    15.0
4  Bosnia and Herzegovina     2.0  12.0    13.0
                  country     len   min     max
0               Argentina  3800.0   4.0   230.0
1                 Armenia     2.0  14.0    15.0
2               Australia  2329.0   5.0   850.0
3                 Austria  3345.0   7.0  1100.0
4  Bosnia and Herzegovina     2.0  12.0    13.0


In [31]:
# Finally, know that you can sort by more than one column at a time:
stat_df.sort_values(by=['country', 'len'])


Unnamed: 0,country,len,min,max
0,Argentina,3800.0,4.0,230.0
1,Armenia,2.0,14.0,15.0
2,Australia,2329.0,5.0,850.0
3,Austria,3345.0,7.0,1100.0
4,Bosnia and Herzegovina,2.0,12.0,13.0
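
When sorting by several columns, each column can have its own direction. A sketch on made-up data (`toy` is an assumption):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({"country": ["US", "Italy", "US", "Italy"],
                    "len": [5, 7, 9, 3]})

# country ascending, then len descending within each country
sorted_df = toy.sort_values(by=["country", "len"], ascending=[True, False])
print(sorted_df)

# sort_index() brings the rows back into their original index order
print(sorted_df.sort_index())
```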


Data Types & Missing Values
-

In [32]:
# The data type for a column in a DataFrame or a Series is known as the dtype. You can use the dtype property to grab the type of a specific column. For instance, we can get the dtype of the price column in my_local_df:
my_local_df.price.dtype

dtype('float64')

In [33]:
# Alternatively, the dtypes property returns the dtype of every column in the DataFrame:
my_local_df.dtypes.head(4)

country        object
description    object
designation    object
points          int64
dtype: object

In [34]:
#It's possible to convert a column of one type into another wherever such a conversion makes sense by using the astype() function. For example, we may transform the points column from its existing int64 data type into a float64 data type:
my_local_df.points.astype('float64').head(2)

0    87.0
1    87.0
Name: points, dtype: float64

In [35]:
# Entries with missing values are given the value NaN, short for "Not a Number". For technical reasons these NaN values are always of the float64 dtype.

# Pandas provides some methods specific to missing data. To select NaN entries you can use pd.isnull() (or its companion pd.notnull()). They are meant to be used like this:

my_local_df[pd.isnull(my_local_df.country)].head(2)

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,index_backwards
913,,"Amber in color, this wine has aromas of peach ...",Asureti Valley,87,30.0,,,,Mike DeSimone,@worldwineguys,Gotsa Family Wines 2014 Asureti Valley Chinuri,Chinuri,Gotsa Family Wines,everyone,129058
3131,,"Soft, fruity and juicy, this is a pleasant, si...",Partager,83,,,,,Roger Voss,@vossroger,Barton & Guestier NV Partager Red,Red Blend,Barton & Guestier,everyone,126840


In [36]:
# fillna() provides a few different strategies for handling missing values. For example, we can replace each NaN with "Unknown":
my_local_df.region_2.fillna("Unknown").head()



0              Unknown
1              Unknown
2    Willamette Valley
3              Unknown
4    Willamette Valley
Name: region_2, dtype: object
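
A sketch of a few ways to handle missing values, on made-up Series (`region` and `price` are assumptions). Which strategy is appropriate depends on the data.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
region = pd.Series(["Etna", None, "Napa", None], name="region")
price = pd.Series([10.0, None, 30.0], name="price")

# Count the missing entries
missing = pd.isnull(region).sum()

# Replace missing values with a constant
filled_const = region.fillna("Unknown")

# Replace missing values with the column mean (mean() ignores NaN)
filled_mean = price.fillna(price.mean())

# Carry the last valid value forward
filled_forward = price.ffill()

print(missing)
print(filled_const.tolist())
print(filled_mean.tolist())
print(filled_forward.tolist())
```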

In [37]:
#Alternatively we can use the replace() method:
my_local_df.taster_twitter_handle.replace("@kerinokeefe", "@kerino").head(3)

0        @kerino
1     @vossroger
2    @paulgwine 
Name: taster_twitter_handle, dtype: object

Renaming
-

In [38]:
# Renaming the points column to score
my_local_df.rename(columns={'points': 'score'}).head(1)

Unnamed: 0,country,description,designation,score,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,index_backwards
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia,everyone,129971


In [39]:
# rename() lets you rename index or column values by specifying an index or column keyword parameter, respectively.
my_local_df.rename(index={0: 'firstEntry', 1: 'secondEntry'}).head(3)

Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,index_backwards
firstEntry,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia,everyone,129971
secondEntry,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos,everyone,129970
2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm,everyone,129969


In [40]:
# You'll probably rename columns very often, but rename index values very rarely. For that, set_index() is usually more convenient.
# Both the row index and the column index can have their own name attribute. The complementary rename_axis() method may be used to change these names. For example:

my_local_df.rename_axis("Wines", axis='rows').rename_axis("Fields", axis='columns').head(1)

Fields,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery,critic,index_backwards
Wines,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia,everyone,129971
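
The renaming methods all return a new DataFrame instead of changing the original. A sketch on made-up data (`toy` is an assumption):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({"points": [87, 90]}, index=[0, 1])

# Rename a column and an index value in one call
renamed = toy.rename(columns={"points": "score"}, index={0: "firstEntry"})

# Name the row axis and the column axis
named = toy.rename_axis("Wines", axis="rows").rename_axis("Fields", axis="columns")

print(renamed)
print(named)
print(list(toy.columns))  # toy itself is unchanged
```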


Combining 
-

In [41]:
# The simplest combining method is concat(). It is most useful when both DataFrames have the same columns; columns missing from one of them are filled with NaN.

canadian_youtube = pd.read_csv("file://localhost/users/nicki/desktop/udvikling/python/data/CAvideos.csv")
british_youtube = pd.read_csv("file://localhost/users/nicki/desktop/udvikling/python/data/GBvideos.csv")

print(list(canadian_youtube))
print(list(british_youtube))
pd.concat([canadian_youtube, british_youtube]).head(1)



['video_id', 'trending_date', 'title', 'channel_title', 'category_id', 'publish_time', 'tags', 'views', 'likes', 'dislikes', 'comment_count', 'thumbnail_link', 'comments_disabled', 'ratings_disabled', 'video_error_or_removed', 'description']
['video_id', 'trending_date', 'title', 'channel_title', 'category_id', 'publish_time', 'tags', 'views', 'likes', 'dislikes', 'comment_count', 'thumbnail_link', 'comments_disabled', 'ratings_disabled', 'video_error_or_removed', 'description']


Unnamed: 0,video_id,trending_date,title,channel_title,category_id,publish_time,tags,views,likes,dislikes,comment_count,thumbnail_link,comments_disabled,ratings_disabled,video_error_or_removed,description
0,n1WpP7iowLc,17.14.11,Eminem - Walk On Water (Audio) ft. Beyoncé,EminemVEVO,10,2017-11-10T17:00:03.000Z,"Eminem|""Walk""|""On""|""Water""|""Aftermath/Shady/In...",17158579,787425,43420,125882,https://i.ytimg.com/vi/n1WpP7iowLc/default.jpg,False,False,False,Eminem's new track Walk on Water ft. Beyoncé i...
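
A sketch of how concat() treats the index and mismatched columns, using two made-up DataFrames (`ca` and `gb` are assumptions, not the YouTube files):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
ca = pd.DataFrame({"video_id": ["a", "b"], "views": [10, 20]})
gb = pd.DataFrame({"video_id": ["c"], "views": [30], "likes": [5]})

# By default the original index labels are kept, so they can repeat
stacked = pd.concat([ca, gb])

# ignore_index=True builds a fresh 0..n-1 index
renumbered = pd.concat([ca, gb], ignore_index=True)

print(list(stacked.index))
print(renumbered)
```

The likes column exists only in `gb`, so the rows coming from `ca` hold NaN there.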


In [42]:
#The middlemost combiner in terms of complexity is join(). join() lets you combine different DataFrame objects which have an index in common. For example, to pull down videos that happened to be trending on the same day in both Canada and the UK, we could do the following:

left = canadian_youtube.set_index(['title', 'trending_date'])
right = british_youtube.set_index(['title', 'trending_date'])

left.join(right, lsuffix='_CAN', rsuffix='_UK').head(1)

#The lsuffix and rsuffix parameters are necessary here because the data has the same column names in both British and Canadian datasets. If this wasn't true (because, say, we'd renamed them beforehand) we wouldn't need them.

Unnamed: 0_level_0,Unnamed: 1_level_0,video_id_CAN,channel_title_CAN,category_id_CAN,publish_time_CAN,tags_CAN,views_CAN,likes_CAN,dislikes_CAN,comment_count_CAN,thumbnail_link_CAN,...,tags_UK,views_UK,likes_UK,dislikes_UK,comment_count_UK,thumbnail_link_UK,comments_disabled_UK,ratings_disabled_UK,video_error_or_removed_UK,description_UK
title,trending_date,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1
!! THIS VIDEO IS NOTHING BUT PAIN !! | Getting Over It - Part 7,18.04.01,PNn8sECd7io,Markiplier,20,2018-01-03T19:33:53.000Z,"getting over it|""markiplier""|""funny moments""|""...",835930,47058,1023,8250,https://i.ytimg.com/vi/PNn8sECd7io/default.jpg,...,,,,,,,,,,
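
The empty _UK values in the output above are consistent with join() being a left join by default: every row of the left DataFrame is kept, matched or not. A sketch on made-up data (`left` and `right` here are assumptions, not the YouTube frames):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
left = pd.DataFrame({"title": ["x", "y"],
                     "trending_date": ["d1", "d1"],
                     "views": [1, 2]}).set_index(["title", "trending_date"])
right = pd.DataFrame({"title": ["y", "z"],
                      "trending_date": ["d1", "d1"],
                      "views": [20, 30]}).set_index(["title", "trending_date"])

# Default (left) join: all rows of left are kept, unmatched ones get NaN
left_join = left.join(right, lsuffix="_CAN", rsuffix="_UK")

# how="inner" keeps only the index values present in both
inner_join = left.join(right, lsuffix="_CAN", rsuffix="_UK", how="inner")

print(left_join)
print(inner_join)
```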


In [43]:
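# Joining two DataFrames loaded from CSV files on a shared ID column.
# CSV_1, CSV_2 and "SomeCommonID" are placeholders, so the line is left commented out:
#CSV_1.set_index("SomeCommonID").join(CSV_2.set_index("SomeCommonID"))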

In [44]:
# Merging two DataFrames on a shared column.
# DataFrame_1, DataFrame_2 and "SomeCommonID" are placeholders; merge() needs DataFrame objects, not strings, so the line is left commented out:
#pd.merge(DataFrame_1, DataFrame_2, on="SomeCommonID")
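
A runnable sketch of both templates with two made-up DataFrames. `df1`, `df2` and their contents are assumptions; only the column name "SomeCommonID" is taken from the cells above.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df1 = pd.DataFrame({"SomeCommonID": [1, 2, 3], "name": ["a", "b", "c"]})
df2 = pd.DataFrame({"SomeCommonID": [2, 3, 4], "score": [20, 30, 40]})

# merge() takes the DataFrame objects themselves and does an inner join by default
merged = pd.merge(df1, df2, on="SomeCommonID")

# join() works on the index and does a left join by default
joined = df1.set_index("SomeCommonID").join(df2.set_index("SomeCommonID"))

print(merged)
print(joined)
```

With these defaults, merge() keeps only the IDs found in both frames, while join() keeps every ID from the left frame.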