# 4 Pandas

When dealing with numeric matrices and vectors in Python, NumPy makes life a lot easier. However, for those used to working with dedicated languages like [R](https://www.r-project.org/), doing data analysis directly with NumPy feels like a step back. Fortunately, some nice folks have written the Python Data Analysis Library (a.k.a. [Pandas](http://pandas.pydata.org/)). Pandas provides an R-like DataFrame, produces high-quality plots with matplotlib, and integrates nicely with other libraries that expect NumPy arrays.

Pandas works with `Series` of data, which are then arranged in `DataFrame` objects. A dataframe is the object closest to an Excel spreadsheet that we will see throughout the course. Because dataframes are integrated in Python and can be combined with so many different packages, though, they are much more powerful than simple Excel spreadsheets. The data in a series can be either qualitative or quantitative, and creating a series is as easy as creating a NumPy array from a one-dimensional list.

In [1]:
import numpy as np
import pandas as pd
print('Pandas:', pd.__version__)

Pandas: 1.1.5


In [2]:
animals = ['Tiger', 'Bear', 'Moose']
pd.Series(animals)

0    Tiger
1     Bear
2    Moose
dtype: object

In [3]:
numbers = [1, 2, 3]
pd.Series(numbers)

0    1
1    2
2    3
dtype: int64

Notice that the series is indexed by default by integers. We can change this indexing by using a dictionary instead of a list to create the series.

In [4]:
sports = {'Archery': 'Bhutan',
          'Golf': 'Scotland',
          'Sumo': 'Japan',
          'Taekwondo': 'South Korea'}
s = pd.Series(sports)
s

Archery           Bhutan
Golf            Scotland
Sumo               Japan
Taekwondo    South Korea
dtype: object
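
A dictionary is not the only way to get a labelled series. As a minimal sketch (the variable names `countries` and `s2` are made up for this illustration), the labels can also be passed explicitly through the `index` argument, and values can then be looked up either by label or by position:

```python
import pandas as pd

# Same countries as above, passed as a plain list plus an explicit index.
countries = ['Bhutan', 'Scotland', 'Japan', 'South Korea']
s2 = pd.Series(countries, index=['Archery', 'Golf', 'Sumo', 'Taekwondo'])

print(s2['Golf'])      # label-based lookup
print(s2.iloc[0])      # position-based lookup still works
print(list(s2.index))  # the labels we passed in
```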

Dataframes, on the other hand, can be built from two-dimensional arrays, with the option of labelling the columns and indexing the rows. **Every column in a dataframe is a series**.
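
To make the column/series relationship concrete, here is a small sketch using toy data invented for the illustration (the name `small` and its contents are hypothetical). It builds a dataframe from a dictionary of lists and checks that a single column is a `Series` sharing the row index of the dataframe:

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
small = pd.DataFrame({'animal': ['Tiger', 'Bear', 'Moose'],
                      'legs': [4, 4, 4]},
                     index=['a', 'b', 'c'])

col = small['animal']
print(type(col))                       # a pandas Series
print(col.index.equals(small.index))   # the column shares the row index
```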

In [5]:
# Sample a 1000 x 6 array from the standard normal distribution and wrap it in a DataFrame
u = pd.DataFrame(np.random.randn(1000, 6),
                 index=np.arange(0, 3000, 3),
                 columns=['A', 'B', 'C', 'D', 'E', 'F'])

print(type(u))

u

<class 'pandas.core.frame.DataFrame'>


Unnamed: 0,A,B,C,D,E,F
0,1.529691,-0.538416,-0.244388,0.243894,-1.434319,-0.406120
3,1.129496,0.141477,-0.240872,0.365254,-1.635272,-0.947670
6,1.061280,-1.501800,2.063890,0.390767,0.410590,-1.533273
9,1.005475,-0.030160,-0.600576,-1.193379,0.154116,0.881702
12,0.224674,0.814185,1.080010,-0.326918,-0.806828,1.357435
...,...,...,...,...,...,...
2985,-0.756832,-0.154784,-1.314841,1.339179,1.188416,-0.788135
2988,2.006717,0.073238,-1.880424,0.183553,1.482888,0.000581
2991,0.157970,1.966254,-0.611997,0.331002,-0.584559,-0.048175
2994,-2.501087,1.129221,-0.282431,-0.557586,1.363579,1.360922


As you might have noticed, printing a massive dataframe is not the best way to look at it. Some functions give us a nicer look at parts of the dataframe, to get an idea of "how things are going".

In [6]:
u.head()

Unnamed: 0,A,B,C,D,E,F
0,1.529691,-0.538416,-0.244388,0.243894,-1.434319,-0.40612
3,1.129496,0.141477,-0.240872,0.365254,-1.635272,-0.94767
6,1.06128,-1.5018,2.06389,0.390767,0.41059,-1.533273
9,1.005475,-0.03016,-0.600576,-1.193379,0.154116,0.881702
12,0.224674,0.814185,1.08001,-0.326918,-0.806828,1.357435


In [7]:
u.tail()

Unnamed: 0,A,B,C,D,E,F
2985,-0.756832,-0.154784,-1.314841,1.339179,1.188416,-0.788135
2988,2.006717,0.073238,-1.880424,0.183553,1.482888,0.000581
2991,0.15797,1.966254,-0.611997,0.331002,-0.584559,-0.048175
2994,-2.501087,1.129221,-0.282431,-0.557586,1.363579,1.360922
2997,0.55497,0.131605,0.384242,-0.801872,-0.044222,-1.430762


In [8]:
u.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000 entries, 0 to 2997
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   A       1000 non-null   float64
 1   B       1000 non-null   float64
 2   C       1000 non-null   float64
 3   D       1000 non-null   float64
 4   E       1000 non-null   float64
 5   F       1000 non-null   float64
dtypes: float64(6)
memory usage: 54.7 KB


In [9]:
u.describe()

Unnamed: 0,A,B,C,D,E,F
count,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0
mean,-0.026039,0.039695,0.004609,0.056757,0.005928,0.025661
std,1.010233,1.021984,1.024916,0.993804,0.981599,0.969133
min,-3.164641,-3.413782,-2.735776,-3.435198,-3.175102,-2.891552
25%,-0.725722,-0.646261,-0.715743,-0.625131,-0.643796,-0.637663
50%,-0.050568,0.046615,0.028489,0.044099,0.012637,0.026007
75%,0.659858,0.696789,0.701733,0.702746,0.664014,0.666779
max,3.346081,2.90429,3.526012,2.961118,3.029467,3.835209
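
Besides `head`, `tail`, `info` and `describe`, a few attributes and methods are handy for a quick look. The sketch below uses a stand-in dataframe called `u_demo` (a made-up name, with freshly sampled values rather than the ones shown above) so that it runs on its own:

```python
import numpy as np
import pandas as pd

# Stand-in for `u`: same shape and labels, freshly sampled values.
rng = np.random.default_rng(0)
u_demo = pd.DataFrame(rng.standard_normal((1000, 6)),
                      index=np.arange(0, 3000, 3),
                      columns=list('ABCDEF'))

print(u_demo.shape)                      # (number of rows, number of columns)
print(list(u_demo.columns))              # column labels
print(u_demo.dtypes)                     # data type of every column
print(u_demo.sample(3, random_state=0))  # three randomly chosen rows
```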


### Indexing/Slicing in Pandas

The simplest way to access data in a Pandas dataframe, and the one closest to NumPy, is the `iloc` indexer. With `iloc` we select by integer position, using the same indexing techniques that we saw with NumPy in the previous notebook.

In [10]:
# Slice-in rows index 125 to 131 (132 excluded!) from columns index 0, 2 and 5
u.iloc[125:132, [0, 2, 5]]

Unnamed: 0,A,C,F
375,0.72528,0.080836,-0.352246
378,1.389644,-0.739824,-0.745783
381,-0.059435,-0.057974,-1.683555
384,-0.970846,-0.952454,0.595984
387,-1.155227,0.431085,-0.239368
390,-0.726701,-0.279116,0.102658
393,-0.111332,1.695955,1.095941


To select rows and columns by their labels instead of their positions, we use `loc` instead of `iloc`. Unlike `iloc`, a `loc` slice includes its end label.

In [11]:
# Slice-in rows 375 to 393 (393 included!) from columns A, C and F
u.loc[375:393, ['A', 'C', 'F']]

Unnamed: 0,A,C,F
375,0.72528,0.080836,-0.352246
378,1.389644,-0.739824,-0.745783
381,-0.059435,-0.057974,-1.683555
384,-0.970846,-0.952454,0.595984
387,-1.155227,0.431085,-0.239368
390,-0.726701,-0.279116,0.102658
393,-0.111332,1.695955,1.095941
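
The difference between the two slicing rules is easy to trip over, so here is a minimal sketch on a small made-up dataframe (`demo` is a hypothetical name). Position-based slices with `iloc` exclude the end, as in NumPy, while label-based slices with `loc` include it:

```python
import numpy as np
import pandas as pd

# Row labels are 0, 3, 6, ..., 27, so labels and positions differ.
demo = pd.DataFrame({'A': np.arange(10)}, index=np.arange(0, 30, 3))

by_position = demo.iloc[2:5]   # positions 2, 3, 4 (position 5 is excluded)
by_label = demo.loc[6:15]      # labels 6, 9, 12, 15 (label 15 is included)

print(list(by_position.index))
print(list(by_label.index))
```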


However, there are a few other ways of accessing the data in a Pandas dataframe, which typically have a more "direct" connection with the actual content of the dataframe. Individual columns or sets of columns can also be accessed by their names. Choosing a single column gives a Series, while choosing two or more produces a DataFrame.

In [12]:
u['A'].head()

0     1.529691
3     1.129496
6     1.061280
9     1.005475
12    0.224674
Name: A, dtype: float64

In [13]:
u[['A', 'D']].head()

Unnamed: 0,A,D
0,1.529691,0.243894
3,1.129496,0.365254
6,1.06128,0.390767
9,1.005475,-1.193379
12,0.224674,-0.326918


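Not only that, we can access a single column as an attribute, without the need for brackets `[]`. This only works when the column name is a valid Python identifier and does not clash with an existing dataframe attribute or method.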

In [14]:
u.A.head()

0     1.529691
3     1.129496
6     1.061280
9     1.005475
12    0.224674
Name: A, dtype: float64
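
Attribute access is convenient, but it has limits. The sketch below uses a hypothetical dataframe `demo` with one column whose name collides with the `DataFrame.count` method and one whose name contains a space. In both cases brackets are the reliable option:

```python
import pandas as pd

demo = pd.DataFrame({'A': [1, 2, 3],
                     'count': [10, 20, 30],
                     'my col': [0.1, 0.2, 0.3]})

print(demo.A.tolist())          # fine: 'A' is a valid identifier
print(callable(demo.count))     # demo.count is the DataFrame method, not the column
print(demo['count'].tolist())   # brackets always reach the column
print(demo['my col'].tolist())  # names with spaces need brackets
```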

Or we can retrieve the rows that satisfy some condition.

In [15]:
u[u.D > 2]

Unnamed: 0,A,B,C,D,E,F
21,-0.720298,1.038092,1.619139,2.018143,-0.056179,1.379509
120,-0.067738,-1.020308,-2.266636,2.206936,-0.590255,0.186909
162,-1.912364,-0.853718,1.50083,2.66451,-0.699474,0.944029
177,-1.594947,0.066061,0.238336,2.669839,-0.535638,1.21029
315,0.828884,-0.269063,0.353555,2.456021,0.598594,0.337546
483,0.809037,-0.960082,0.741255,2.262807,0.373416,-1.027757
552,2.461966,-0.044535,-0.875278,2.104515,1.480918,0.94849
714,1.527885,-0.983501,0.2322,2.230415,0.470912,1.496632
753,-0.297296,-0.284134,-0.172245,2.049252,0.705034,-0.412757
783,0.136426,0.279603,-0.635129,2.764491,1.914615,-0.599495


Dataframes provide the `query` method for the same purpose. While it is less powerful than boolean indexing, it can be faster on large dataframes and is often shorter to write (especially when the dataframe name is longer than just `u`).

In [16]:
u.query('D > 2')

Unnamed: 0,A,B,C,D,E,F
21,-0.720298,1.038092,1.619139,2.018143,-0.056179,1.379509
120,-0.067738,-1.020308,-2.266636,2.206936,-0.590255,0.186909
162,-1.912364,-0.853718,1.50083,2.66451,-0.699474,0.944029
177,-1.594947,0.066061,0.238336,2.669839,-0.535638,1.21029
315,0.828884,-0.269063,0.353555,2.456021,0.598594,0.337546
483,0.809037,-0.960082,0.741255,2.262807,0.373416,-1.027757
552,2.461966,-0.044535,-0.875278,2.104515,1.480918,0.94849
714,1.527885,-0.983501,0.2322,2.230415,0.470912,1.496632
753,-0.297296,-0.284134,-0.172245,2.049252,0.705034,-0.412757
783,0.136426,0.279603,-0.635129,2.764491,1.914615,-0.599495
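
Conditions can also be combined. As a sketch on made-up data (`demo` and `threshold` are hypothetical names), boolean masks are joined with `&` and need parentheses around each condition, while `query` accepts `and` and can refer to Python variables with an `@` prefix:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.standard_normal((200, 3)), columns=['A', 'B', 'D'])

threshold = 1.0

# Boolean indexing: each condition in parentheses, joined with &
with_mask = demo[(demo.D > threshold) & (demo.A < 0)]

# query: the same selection written as a string expression
with_query = demo.query('D > @threshold and A < 0')

print(with_mask.equals(with_query))
```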


### Reshaping `DataFrame`

We can reshape and concatenate dataframes in much the same way as NumPy arrays.

In [17]:
df1 = pd.DataFrame()

df1['sample'] = ['A', 'A', 'A', 'B', 'B', 'B']
df1['replicate'] = ['01', '02', '03', '01', '02', '03']
df1['protein'] = 'P02768'
df1['value1'] = np.random.randn(6)

df1

Unnamed: 0,sample,replicate,protein,value1
0,A,1,P02768,-0.111505
1,A,2,P02768,-1.652959
2,A,3,P02768,-0.374804
3,B,1,P02768,-0.625137
4,B,2,P02768,-1.615382
5,B,3,P02768,-1.309753


In [18]:
pivot_df1 = df1.pivot(index='replicate', columns='sample', values='value1')

pivot_df1.head()

sample,A,B
replicate,Unnamed: 1_level_1,Unnamed: 2_level_1
1,-0.111505,-0.625137
2,-1.652959,-1.615382
3,-0.374804,-1.309753
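
Going the other way, from the wide layout back to one row per measurement, can be done with `melt`. This is a minimal sketch with fixed, made-up values in place of the random numbers above (`long_df`, `wide` and `back` are hypothetical names):

```python
import pandas as pd

# Hypothetical values standing in for the random numbers above.
long_df = pd.DataFrame({'sample': ['A', 'A', 'A', 'B', 'B', 'B'],
                        'replicate': ['01', '02', '03', '01', '02', '03'],
                        'value1': [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]})

# Long -> wide: one row per replicate, one column per sample
wide = long_df.pivot(index='replicate', columns='sample', values='value1')

# Wide -> long: melt needs 'replicate' as a regular column, hence reset_index()
back = wide.reset_index().melt(id_vars='replicate',
                               var_name='sample',
                               value_name='value1')
print(wide)
print(back)
```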


### Computing With `DataFrames`

We can calculate with `DataFrames` or their columns (which are `Series`) the same way we would work with numpy arrays.

In [19]:
df1['value2'] = 1 / df1['value1']
df1.head()

Unnamed: 0,sample,replicate,protein,value1,value2
0,A,1,P02768,-0.111505,-8.968197
1,A,2,P02768,-1.652959,-0.604976
2,A,3,P02768,-0.374804,-2.66806
3,B,1,P02768,-0.625137,-1.599649
4,B,2,P02768,-1.615382,-0.619049


In [20]:
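# 'replicate' holds strings, so its "mean" below is meaningless (newer pandas versions may raise an error instead)
np.mean(df1)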

replicate    1.700502e+09
value1      -9.482567e-01
value2      -2.537239e+00
dtype: float64

We can also apply functions to the whole dataset or to specific columns with the `apply` method. `apply` acts on a whole column at a time (i.e. a Pandas `Series`), so we can compute things that depend on several values of the column, for instance the mean. To apply a function on a truly element-by-element basis, use `applymap` (on a dataframe; renamed to `DataFrame.map` in pandas 2.1 and later) or `Series.apply` (on a single column).

In [21]:
def mean(col):
    return sum(col) / len(col)

df1[['value1', 'value2']].apply(mean)

value1   -0.948257
value2   -2.537239
dtype: float64

Most common statistics can be calculated directly with built-in methods (including the mean in the example above), but `apply` also works on columns with strings or categorical data, where no mathematical operations are defined. The limit is the imagination.
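
The two modes can be put side by side in a short sketch. The dataframe `demo` and the two lambda functions are invented for the illustration. The first call receives a whole column at a time, the second one a single cell at a time:

```python
import pandas as pd

demo = pd.DataFrame({'sample': ['A', 'B', 'A'],
                     'value1': [1.0, 2.0, 4.0],
                     'value2': [10.0, 20.0, 40.0]})

# Column-wise: the function receives a whole Series and returns one number.
value_range = demo[['value1', 'value2']].apply(lambda col: col.max() - col.min())

# Element-wise on one column: the function receives one cell at a time.
labels = demo['sample'].apply(lambda s: 'sample_' + s.lower())

print(value_range)
print(labels.tolist())
```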

### Combining `DataFrames`

Something we will do quite often as scientists is combine data from different sources into a single one. Pandas offers different commands for this, depending on what we want to achieve.

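To begin with, new rows of data can be appended with the `append` method. Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so on recent versions use `concat` instead.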

In [22]:
df2 = pd.DataFrame()

df2['sample'] = ['A', 'A', 'A', 'B', 'B', 'B']
df2['replicate'] = ['01', '02', '03', '01', '02', '03']
df2['protein'] = 'P69892'
df2['value1'] = np.random.randn(6)
df2['value2'] = 1 / df2['value1']

df2

Unnamed: 0,sample,replicate,protein,value1,value2
0,A,1,P69892,0.323275,3.093342
1,A,2,P69892,2.438003,0.410172
2,A,3,P69892,0.017873,55.951425
3,B,1,P69892,1.69584,0.589678
4,B,2,P69892,-0.094675,-10.562401
5,B,3,P69892,0.557293,1.794387


In [23]:
df1.append(df2, ignore_index=True)

Unnamed: 0,sample,replicate,protein,value1,value2
0,A,1,P02768,-0.111505,-8.968197
1,A,2,P02768,-1.652959,-0.604976
2,A,3,P02768,-0.374804,-2.66806
3,B,1,P02768,-0.625137,-1.599649
4,B,2,P02768,-1.615382,-0.619049
5,B,3,P02768,-1.309753,-0.763503
6,A,1,P69892,0.323275,3.093342
7,A,2,P69892,2.438003,0.410172
8,A,3,P69892,0.017873,55.951425
9,B,1,P69892,1.69584,0.589678


The same result can be obtained with `concat`.

In [24]:
df = pd.concat([df1, df2], ignore_index=True)

df

Unnamed: 0,sample,replicate,protein,value1,value2
0,A,1,P02768,-0.111505,-8.968197
1,A,2,P02768,-1.652959,-0.604976
2,A,3,P02768,-0.374804,-2.66806
3,B,1,P02768,-0.625137,-1.599649
4,B,2,P02768,-1.615382,-0.619049
5,B,3,P02768,-1.309753,-0.763503
6,A,1,P69892,0.323275,3.093342
7,A,2,P69892,2.438003,0.410172
8,A,3,P69892,0.017873,55.951425
9,B,1,P69892,1.69584,0.589678
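
Stacking rows is only one way of combining data. When two tables share a key column, `pd.merge` joins them column-wise, much like a database join. The sketch below is only an illustration: the `annotation` table and its `label` values are hypothetical placeholders, not real protein annotations.

```python
import pandas as pd

measurements = pd.DataFrame({'protein': ['P02768', 'P02768', 'P69892'],
                             'value1': [0.5, 0.7, 1.2]})

# Hypothetical annotation table; the labels are placeholders for illustration.
annotation = pd.DataFrame({'protein': ['P02768', 'P69892'],
                           'label': ['protein_1', 'protein_2']})

# Left join: keep every measurement and attach the matching label
merged = pd.merge(measurements, annotation, on='protein', how='left')
print(merged)
```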


### Grouping Data

In [25]:
df.groupby('protein').agg(sum)

Unnamed: 0_level_0,value1,value2
protein,Unnamed: 1_level_1,Unnamed: 2_level_1
P02768,-5.68954,-15.223434
P69892,4.937608,51.276602


In [26]:
df.groupby(['protein', 'sample']).agg(sum)

Unnamed: 0_level_0,Unnamed: 1_level_0,value1,value2
protein,sample,Unnamed: 2_level_1,Unnamed: 3_level_1
P02768,A,-2.139269,-12.241233
P02768,B,-3.550272,-2.982201
P69892,A,2.779151,59.454939
P69892,B,2.158458,-8.178336


In [27]:
df.groupby(['protein', 'sample', 'replicate']).agg(sum)

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,value1,value2
protein,sample,replicate,Unnamed: 3_level_1,Unnamed: 4_level_1
P02768,A,1,-0.111505,-8.968197
P02768,A,2,-1.652959,-0.604976
P02768,A,3,-0.374804,-2.66806
P02768,B,1,-0.625137,-1.599649
P02768,B,2,-1.615382,-0.619049
P02768,B,3,-1.309753,-0.763503
P69892,A,1,0.323275,3.093342
P69892,A,2,2.438003,0.410172
P69892,A,3,0.017873,55.951425
P69892,B,1,1.69584,0.589678


In [28]:
df.groupby('protein').transform(np.mean)

Unnamed: 0,replicate,value1,value2
0,1700502000.0,-0.948257,-2.537239
1,1700502000.0,-0.948257,-2.537239
2,1700502000.0,-0.948257,-2.537239
3,1700502000.0,-0.948257,-2.537239
4,1700502000.0,-0.948257,-2.537239
5,1700502000.0,-0.948257,-2.537239
6,1700502000.0,0.822935,8.5461
7,1700502000.0,0.822935,8.5461
8,1700502000.0,0.822935,8.5461
9,1700502000.0,0.822935,8.5461


In [29]:
df.groupby('protein')[['value1', 'value2']].transform(np.mean)


Unnamed: 0,value1,value2
0,-0.948257,-2.537239
1,-0.948257,-2.537239
2,-0.948257,-2.537239
3,-0.948257,-2.537239
4,-0.948257,-2.537239
5,-0.948257,-2.537239
6,0.822935,8.5461
7,0.822935,8.5461
8,0.822935,8.5461
9,0.822935,8.5461
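
The cells above use both `agg` and `transform`, and the difference lies in the shape of the result: `agg` returns one row per group, whereas `transform` returns one value per original row. A small sketch with made-up numbers (`demo`, `per_group` and `per_row` are hypothetical names) shows why the second form is useful, for example to centre each value on its group mean:

```python
import pandas as pd

demo = pd.DataFrame({'protein': ['P1', 'P1', 'P2', 'P2'],
                     'value1': [1.0, 3.0, 10.0, 20.0]})

grouped = demo.groupby('protein')['value1']

per_group = grouped.agg('mean')       # one row per group
per_row = grouped.transform('mean')   # one value per original row

# A typical use of transform: centre each value on its group mean.
demo['centered'] = demo['value1'] - per_row

print(per_group)
print(demo)
```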


In [30]:
for g, g_df in df.groupby(['protein', 'sample']):
    print(g_df)
    print(f"{g} --> mean value1: {np.mean(g_df['value1'])}")
    print(f"      mean value2: {np.mean(g_df['value2'])}\n")

  sample replicate protein    value1    value2
0      A        01  P02768 -0.111505 -8.968197
1      A        02  P02768 -1.652959 -0.604976
2      A        03  P02768 -0.374804 -2.668060
('P02768', 'A') --> mean value1: -0.7130895303284994
      mean value2: -4.080410960967531

  sample replicate protein    value1    value2
3      B        01  P02768 -0.625137 -1.599649
4      B        02  P02768 -1.615382 -0.619049
5      B        03  P02768 -1.309753 -0.763503
('P02768', 'B') --> mean value1: -1.1834239532522366
      mean value2: -0.9940669307487835

  sample replicate protein    value1     value2
6      A        01  P69892  0.323275   3.093342
7      A        02  P69892  2.438003   0.410172
8      A        03  P69892  0.017873  55.951425
('P69892', 'A') --> mean value1: 0.9263835112471502
      mean value2: 19.81831294865032

   sample replicate protein    value1     value2
9       B        01  P69892  1.695840   0.589678
10      B        02  P69892 -0.094675 -10.562401
11      B 

In [31]:
df.groupby(['protein', 'sample']).describe()

Unnamed: 0_level_0,Unnamed: 1_level_0,value1,value1,value1,value1,value1,value1,value1,value1,value2,value2,value2,value2,value2,value2,value2,value2
Unnamed: 0_level_1,Unnamed: 1_level_1,count,mean,std,min,25%,50%,75%,max,count,mean,std,min,25%,50%,75%,max
protein,sample,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2
P02768,A,3.0,-0.71309,0.824529,-1.652959,-1.013882,-0.374804,-0.243155,-0.111505,3.0,-4.080411,4.356825,-8.968197,-5.818129,-2.66806,-1.636518,-0.604976
P02768,B,3.0,-1.183424,0.507066,-1.615382,-1.462567,-1.309753,-0.967445,-0.625137,3.0,-0.994067,0.5294,-1.599649,-1.181576,-0.763503,-0.691276,-0.619049
P69892,A,3.0,0.926384,1.317977,0.017873,0.170574,0.323275,1.380639,2.438003,3.0,19.818313,31.320939,0.410172,1.751757,3.093342,29.522384,55.951425
P69892,B,3.0,0.719486,0.90621,-0.094675,0.231309,0.557293,1.126567,1.69584,3.0,-2.726112,6.813105,-10.562401,-4.986362,0.589678,1.192032,1.794387


In [32]:
df.pivot_table(index='protein',
               columns='sample', 
               aggfunc='mean')

Unnamed: 0_level_0,value1,value1,value2,value2
sample,A,B,A,B
protein,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2
P02768,-0.71309,-1.183424,-4.080411,-0.994067
P69892,0.926384,0.719486,19.818313,-2.726112


In [33]:
df.pivot_table(index='protein',
               columns='sample',
               aggfunc={'value1': min,
                        'value2': max})

Unnamed: 0_level_0,value1,value1,value2,value2
sample,A,B,A,B
protein,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2
P02768,-1.652959,-1.615382,-0.604976,-0.619049
P69892,0.017873,-0.094675,55.951425,1.794387
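
A pivot table can be thought of as a `groupby` followed by a reshape. As a sketch on a small made-up dataframe (`demo`, `via_pivot` and `via_groupby` are hypothetical names), grouping by both keys, averaging, and then moving the `sample` level to the columns with `unstack` gives the same numbers as `pivot_table`:

```python
import pandas as pd

demo = pd.DataFrame({'protein': ['P1', 'P1', 'P1', 'P2', 'P2', 'P2'],
                     'sample':  ['A', 'A', 'B', 'A', 'B', 'B'],
                     'value1':  [1.0, 3.0, 5.0, 2.0, 4.0, 8.0]})

via_pivot = demo.pivot_table(index='protein', columns='sample',
                             values='value1', aggfunc='mean')

# Same computation spelled out: group, aggregate, then reshape
via_groupby = demo.groupby(['protein', 'sample'])['value1'].mean().unstack()

print(via_pivot)
print(via_groupby)
```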


### Loading and saving dataframes

To save and load Pandas dataframes we will use the `to_csv` and `read_csv` commands. Whenever the dataframe does not contain any column of type `object` we can also use the Feather format with `to_feather` (this requires the `pyarrow` package). If the cells hold arbitrary Python objects, such as functions, we can use the pickle format with `to_pickle`.

In [34]:
df.to_csv('test.csv')
pd.read_csv('test.csv', index_col=0)

Unnamed: 0,sample,replicate,protein,value1,value2
0,A,1,P02768,-0.111505,-8.968197
1,A,2,P02768,-1.652959,-0.604976
2,A,3,P02768,-0.374804,-2.66806
3,B,1,P02768,-0.625137,-1.599649
4,B,2,P02768,-1.615382,-0.619049
5,B,3,P02768,-1.309753,-0.763503
6,A,1,P69892,0.323275,3.093342
7,A,2,P69892,2.438003,0.410172
8,A,3,P69892,0.017873,55.951425
9,B,1,P69892,1.69584,0.589678
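
One thing to keep in mind with CSV files is that column types are inferred again when the file is read, so strings such as `'01'` come back as integers unless we say otherwise. The following sketch uses a small made-up dataframe and temporary files (all names are hypothetical). It compares a plain CSV round trip, a CSV round trip with an explicit `dtype`, and a pickle round trip:

```python
import os
import tempfile

import pandas as pd

demo = pd.DataFrame({'replicate': ['01', '02', '03'],
                     'value1': [0.1, 0.2, 0.3]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, 'demo.csv')
    pkl_path = os.path.join(tmp, 'demo.pkl')

    demo.to_csv(csv_path)
    demo.to_pickle(pkl_path)

    from_csv = pd.read_csv(csv_path, index_col=0)
    from_csv_str = pd.read_csv(csv_path, index_col=0, dtype={'replicate': str})
    from_pickle = pd.read_pickle(pkl_path)

print(from_csv['replicate'].tolist())      # parsed as integers, leading zeros lost
print(from_csv_str['replicate'].tolist())  # kept as strings
print(from_pickle.equals(demo))            # pickle preserves the dataframe as it was
```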


Pandas also has dedicated commands to load and save Excel spreadsheets (yay!). To use them with the pandas version shown here you'll need the `openpyxl` and `xlrd` packages (recent pandas versions read `.xlsx` files with `openpyxl` alone).

In [35]:
df.to_excel('test.xlsx', sheet_name='My sheet')
pd.read_excel('test.xlsx', sheet_name='My sheet', index_col=0)

Unnamed: 0,sample,replicate,protein,value1,value2
0,A,1,P02768,-0.111505,-8.968197
1,A,2,P02768,-1.652959,-0.604976
2,A,3,P02768,-0.374804,-2.66806
3,B,1,P02768,-0.625137,-1.599649
4,B,2,P02768,-1.615382,-0.619049
5,B,3,P02768,-1.309753,-0.763503
6,A,1,P69892,0.323275,3.093342
7,A,2,P69892,2.438003,0.410172
8,A,3,P69892,0.017873,55.951425
9,B,1,P69892,1.69584,0.589678


**Exercise 5**: Download [this dataset](https://raw.githubusercontent.com/ChihChengLiang/pokemongor/master/data-raw/pokemons.csv) and load it, using the first column as the index. Take a look at it, and do the following things:
- Choose the columns 'Identifier', 'BaseStamina', 'BaseAttack', 'BaseDefense', 'Type1' and 'Type2' 
- Create a function that lowercases strings and apply it to 'Type1' and 'Type2' (*Extra: just capitalize the strings, i.e., leave the first letter uppercase and lowercase the rest*)
- Create a function that returns a Boolean value (don't be afraid of this, it is a function that returns either True or False) that tells if a Pokémon has high stamina (BaseStamina>170) or not. Store this information in a new column and show the list of Pokémon with high stamina
- Show the instructor the last 15 rows of your dataset

In [36]:
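df = pd.read_csv('https://raw.githubusercontent.com/ChihChengLiang/pokemongor/master/data-raw/pokemons.csv',
                 index_col=0)

# .copy() avoids a SettingWithCopyWarning when columns are modified below
df = df[['Identifier', 'BaseStamina', 'BaseAttack', 'BaseDefense', 'Type1', 'Type2']].copy()

def capitalize(st):
    # Leave non-string cells (e.g. missing values) untouched
    return st.capitalize() if isinstance(st, str) else st

for col in ['Type1', 'Type2']:
    df[col] = df[col].apply(capitalize)

def highstamina(x):
    # The comparison already evaluates to True or False
    return x > 170

df['HighStamina'] = df.BaseStamina.apply(highstamina)

# A Boolean column can be used directly as a mask
print(df[df['HighStamina']].Identifier)

df.tail(15)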

PkMn
31      Nidoqueen
36       Clefable
39     Jigglypuff
40     Wigglytuff
59       Arcanine
62      Poliwrath
68        Machamp
79       Slowpoke
80        Slowbro
87        Dewgong
89            Muk
103     Exeggutor
108     Lickitung
112        Rhydon
113       Chansey
115    Kangaskhan
130      Gyarados
131        Lapras
134      Vaporeon
143       Snorlax
144      Articuno
145        Zapdos
146       Moltres
149     Dragonite
150        Mewtwo
151           Mew
Name: Identifier, dtype: object


Unnamed: 0_level_0,Identifier,BaseStamina,BaseAttack,BaseDefense,Type1,Type2,HighStamina
PkMn,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
137,Porygon,130,156,158,Normal,,False
138,Omanyte,70,132,160,Rock,Water,False
139,Omastar,140,180,202,Rock,Water,False
140,Kabuto,60,148,142,Rock,Water,False
141,Kabutops,120,190,190,Rock,Water,False
142,Aerodactyl,160,182,162,Rock,Flying,False
143,Snorlax,320,180,180,Normal,,True
144,Articuno,180,198,242,Ice,Flying,True
145,Zapdos,180,232,194,Electric,Flying,True
146,Moltres,180,242,194,Fire,Flying,True
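
The exercise asks for explicit functions, but the same steps can also be written with vectorized operations. The sketch below runs on a tiny stand-in table rather than the downloaded dataset: the stamina values are copied from the output above, while the lowercase type strings are made up for the demonstration.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the Pokémon table (lowercase types are invented).
pk = pd.DataFrame({'Identifier': ['Porygon', 'Omanyte', 'Snorlax'],
                   'BaseStamina': [130, 70, 320],
                   'Type1': ['normal', 'rock', 'normal'],
                   'Type2': [np.nan, 'water', np.nan]})

for col in ['Type1', 'Type2']:
    pk[col] = pk[col].str.capitalize()   # missing values stay missing

# Comparing a whole column at once gives a Boolean column
pk['HighStamina'] = pk['BaseStamina'] > 170

print(pk[pk['HighStamina']].Identifier.tolist())
```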
