# Pandas library and basic table data manipulation

Before moving on to more complex analysis, you need to learn the basics of working with data. Data comes in many forms: one-dimensional, two-dimensional, structured, unstructured, visual, audio, and so on. In the exploratory data analysis lessons we will mostly work with **tabular** data, the kind you know from your favorite (or least favorite) spreadsheet application. Usually each **row** of such a table corresponds to one thing, one instance of something, or one observation. The individual **columns** then hold the properties or measured quantities that characterize these things.
In the Python world, the **pandas** library is the most common tool for processing tabular data. It can read data from many formats (including XLS(X) workbooks), transform it in various ways, compute on whole columns very efficiently, directly compute basic statistics and, last but not least, visualize the results nicely. This lesson introduces the basic concepts and teaches you how to access individual columns, rows, and cells.
You can find more about the pandas library on its homepage: https://pandas.pydata.org/

## Import the `pandas` library

In [1]:
import pandas as pd # We will access pandas using the alias pd

💡 Although this statement imports the `pandas` module (or library), it will not be available under its usual name, but under the **alias** `pd`. The name `pandas` itself will not be defined. In ordinary programming we try to avoid aliases because they make the code less readable for other programmers. Data analysis is different: this one alias is very widely used and saves a lot of typing.

In [2]:
# pandas  # -> would raise NameError

## Load the data table

We will jump straight into pandas and show a typical example of the data we will process with this library.
To read data, pandas has a number of `read_*` functions that handle many different formats. The CSV format ("comma-separated values", see [wiki](https://en.wikipedia.org/wiki/CSV)) is quite common: each record corresponds to one line, and the individual properties of the record are separated by commas (or another character).
To work with this notebook, first download the data file from [this link](static/pokemon.csv). The sample data was generated from the [comprehensive Pokédex on GitHub](https://github.com/veekun/pokedex).

In [3]:
tabulka_pokemonu = pd.read_csv("static/pokemon.csv")

The data (whatever it is) is now loaded into memory and referenced by the `tabulka_pokemonu` variable. Let's see what is hidden in it.

In [4]:
tabulka_pokemonu

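|     | id | name | height | weight | color | shape | is baby | type 1 | type 2 | hp | attack | defense | speed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1 | bulbasaur | 0.7 | 6.9 | green | quadruped | False | Grass | Poison | 45 | 49 | 49 | 45 |
| 1 | 2 | ivysaur | 1.0 | 13.0 | green | quadruped | False | Grass | Poison | 60 | 62 | 63 | 60 |
| 2 | 3 | venusaur | 2.0 | 100.0 | green | quadruped | False | Grass | Poison | 80 | 82 | 83 | 80 |
| 3 | 4 | charmander | 0.6 | 8.5 | red | upright | False | Fire | NaN | 39 | 52 | 43 | 65 |
| 4 | 5 | charmeleon | 1.1 | 19.0 | red | upright | False | Fire | NaN | 58 | 64 | 58 | 80 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 802 | 803 | poipole | 0.6 | 1.8 | purple | upright | False | Poison | NaN | 67 | 73 | 67 | 73 |
| 803 | 804 | naganadel | 3.6 | 150.0 | purple | wings | False | Poison | Dragon | 73 | 73 | 73 | 121 |
| 804 | 805 | stakataka | 5.5 | 820.0 | gray | quadruped | False | Rock | Steel | 61 | 131 | 211 | 13 |
| 805 | 806 | blacephalon | 1.8 | 13.0 | white | humanoid | False | Fire | Ghost | 53 | 127 | 53 | 107 |
| 806 | 807 | zeraora | 1.5 | 44.5 | yellow | humanoid | False | Electric | NaN | 88 | 112 | 75 | 143 |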


If everything worked as it should, you now see a fairly nicely formatted table. The default display in the notebook shows the first five and last five rows (who would risk thousands of rows flooding the browser window?), together with the total number of rows and columns. In this case, the table contains 13 properties (named columns) for 807 different Pokémon (numbered rows).
⚠️ **Warning:** In this simple case the table loaded correctly on the first try, without any special parameters, and all columns appear to contain usable values. That is (especially with the CSV format) sheer luck. Input data usually has problems: columns are unlabeled (or labeled strangely), unusual field or decimal separators are used, many rows have missing (or mistyped) values, ... We will devote a later lesson to **data cleaning**.
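As a rough illustration of the kind of parameters the warning alludes to, the sketch below reads a small CSV text that uses semicolons as field separators and decimal commas. The sample text is invented for this example (it is not the real `pokemon.csv`); `sep` and `decimal` are regular parameters of `pd.read_csv`.

```python
import io

import pandas as pd

# Invented sample: semicolon-separated fields and decimal commas.
raw_text = "name;height;weight\nbulbasaur;0,7;6,9\nivysaur;1,0;13,0\n"

# Only the field separator is given: the numbers stay as text such as "0,7".
half_parsed = pd.read_csv(io.StringIO(raw_text), sep=";")

# Field separator and decimal mark are both given: the numbers become floats.
parsed = pd.read_csv(io.StringIO(raw_text), sep=";", decimal=",")

print(half_parsed.dtypes)
print(parsed.dtypes)
```

Comparing the two `dtypes` listings shows why it pays to check the column types right after loading a file.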
What exactly is the object stored in the `tabulka_pokemonu` variable? What class does it belong to?

## Base classes in `pandas` - DataFrame, Series, Index

In [5]:
type(tabulka_pokemonu)

pandas.core.frame.DataFrame

The answer is `DataFrame`. This term (also used in other popular statistical languages, such as [R](https://www.r-project.org/)) probably has no Czech equivalent, so we will keep talking about tables or instances of the `DataFrame` class. Now let's dissect our first `DataFrame`.
⚠️ **Warning:** You will soon notice that a `DataFrame` works much like a spreadsheet workbook, but you need to know where the parallel ends. Unlike an Excel or LibreOffice Calc workbook, a `DataFrame` contains "only" raw data: it stores no formatting and offers no "editor". The nice visual representation is purely a matter of how `pandas` interacts with the Jupyter notebook, or you can write your own code for it.

In [6]:
vysky = tabulka_pokemonu["height"]
vysky

0      0.7
1      1.0
2      2.0
3      0.6
4      1.1
      ... 
802    0.6
803    3.6
804    5.5
805    1.8
806    1.5
Name: height, Length: 807, dtype: float64

💡 Among other things, a `DataFrame` behaves like a dictionary (`dict`): when you put a key in square brackets, you get the column with that name. In fact, square brackets can select from tables by various other criteria too, but we will get to that.
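The dictionary analogy can be tried out on a tiny stand-in table. This is only a sketch: `small_table` is a hypothetical name, and its three rows are copied from the Pokémon table shown above.

```python
import pandas as pd

# A tiny stand-in table with three rows copied from the Pokémon data.
small_table = pd.DataFrame({
    "name": ["bulbasaur", "ivysaur", "venusaur"],
    "height": [0.7, 1.0, 2.0],
})

# Like a dict, a DataFrame supports membership tests on its keys (column names)...
has_height = "height" in small_table
# ...iterating over it yields the keys...
column_names = list(small_table)
# ...and square brackets with a key return the matching column.
height_column = small_table["height"]

print(has_height, column_names)
print(height_column)
```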
Our dissection continues: let's find out what the `vysky` variable is.

In [7]:
type(vysky)

pandas.core.series.Series

### Series

Columns are of type `Series` (a word we will not translate either). At first glance this type looks like a list (`list`). Let's check whether it behaves like one:

In [8]:
vysky[0]  # First height? ✓

0.7

In [9]:
vysky[-5:]  # Last five heights? ✓

802    0.6
803    3.6
804    5.5
805    1.8
806    1.5
Name: height, dtype: float64

**Task:** Try applying some other list operations you already know to `vysky`. Some work, some don't.

Converting between lists and `Series` is no problem either. The easiest way to create your own `Series` (even outside the context of a table) is to create an instance of the class with a list as the argument:

In [10]:
cisla = pd.Series([1, 2, 3])
cisla

0    1
1    2
2    3
dtype: int64

And vice versa:

In [11]:
cisla.tolist()  # Option 1 (preferred, faster)
list(cisla)     # Option 2

[1, 2, 3]

So how does the `Series` differ from the list, and what is its advantage?
In particular, each column has the following five basic properties:

<img src="static/series.svg" style="max-height: 20em;">

#### 1) values

In [12]:
vysky[:50].values  # For aesthetic reasons, we shorten the column a bit

array([0.7, 1. , 2. , 0.6, 1.1, 1.7, 0.5, 1. , 1.6, 0.3, 0.7, 1.1, 0.3,
       0.6, 1. , 0.3, 1.1, 1.5, 0.3, 0.7, 0.3, 1.2, 2. , 3.5, 0.4, 0.8,
       0.6, 1. , 0.4, 0.8, 1.3, 0.5, 0.9, 1.4, 0.6, 1.3, 0.6, 1.1, 0.5,
       1. , 0.8, 1.6, 0.5, 0.8, 1.2, 0.3, 1. , 1. , 1.5, 0.2])

In [13]:
type(vysky.values)

numpy.ndarray

💡 The values in a `Series` are stored in a special format based on the `ndarray` type from the `numpy` library. We will not go into detail here, but especially for numerical values it saves memory and speeds up mathematical operations (for example, summing all values in a `Series` is significantly faster than summing a list).
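If you want to check the speed claim yourself, a rough sketch like the following can help. It uses the standard `timeit` module; the variable names and the size of one million elements are arbitrary choices for this illustration, and the measured times will differ from machine to machine.

```python
import timeit

import pandas as pd

numbers_list = list(range(1_000_000))
numbers_series = pd.Series(numbers_list)

# Both approaches give the same total...
total_from_list = sum(numbers_list)
total_from_series = int(numbers_series.sum())

# ...but they usually take a different amount of time (results vary by machine).
list_seconds = timeit.timeit(lambda: sum(numbers_list), number=10)
series_seconds = timeit.timeit(lambda: numbers_series.sum(), number=10)

print(total_from_list == total_from_series)
print(f"list: {list_seconds:.4f} s, Series: {series_seconds:.4f} s")
```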

#### 2) type of values

In [14]:
vysky.dtype

dtype('float64')

💡 Unlike lists, all elements of a `Series` should be of the same type (if they are not, the nearest common supertype is chosen). `pandas` has its own set of types, called **dtypes**, which partly mirrors the built-in Python data types but (especially for numeric types) is closer to how the processor works with them. And don't look for inheritance among them (good news?). We will introduce the most common types next time, along with the operations that can be done with columns.
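A small sketch of how the common type is chosen; the three `Series` below are made up for this illustration.

```python
import pandas as pd

only_ints = pd.Series([1, 2, 3])
ints_and_float = pd.Series([1, 2, 3.5])          # integers are promoted to floats
numbers_and_text = pd.Series([1, 2.5, "three"])  # no common numeric type exists

print(only_ints.dtype)         # int64
print(ints_and_float.dtype)    # float64
print(numbers_and_text.dtype)  # object
```

The `object` dtype is the fallback that can hold arbitrary Python objects, at the cost of the speed and memory advantages mentioned earlier.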

#### 3) index

In [15]:
vysky.index

RangeIndex(start=0, stop=807, step=1)

💡 You access list elements by their numerical position (0 for the first element, 1 for the second, ...) and dictionary values by key. pandas introduces a generalized **index**, which can be numeric or string-based, or even built on dates and times. Different kinds of indexes are shown below.

#### 4) name

In [16]:
vysky.name

'height'

💡 A `Series` may or may not have a name. Note that the name is a value stored inside the object itself; it has nothing to do with the name of the variable in which you store it (though it is used as the column label when the `Series` is part of a table).
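The difference between the variable name and the stored name can be seen in a short sketch. The variable name `heights_in_m` and the new label `"height_m"` are invented for this example.

```python
import pandas as pd

heights_in_m = pd.Series([0.7, 1.0, 2.0], name="height")

# The variable is called heights_in_m, but the Series itself is named "height".
print(heights_in_m.name)

# Turning the Series into a one-column table uses that name as the column label.
as_table = heights_in_m.to_frame()
print(list(as_table.columns))

# rename() returns a new Series with a different name; the original keeps its own.
renamed = heights_in_m.rename("height_m")
print(renamed.name, heights_in_m.name)
```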

#### 5) size

In [17]:
vysky.size

807

💡 This property tells you how many elements the `Series` contains. There is no magic to it: it behaves like `len` for a list (and indeed, `len` can also be used on a `Series`). For completeness, note that unlike the other properties, this one is read-only.

**Task:** Find the values of the `.name`, `.index`, `.dtype`, `.values` and `.size` attributes of the `cisla` object. Do you notice anything interesting? Alternatively, look at the same attributes for some other columns of `tabulka_pokemonu`.

When creating `Series` objects, these attributes (except `size` and, to a limited extent, `dtype`) can be specified explicitly:

In [18]:
vek = pd.Series(
    [27, 65, 14],
    name="Age",
    index=["Karla", "Martina", "Žofie"],
    dtype=float,
)
vek

Karla      27.0
Martina    65.0
Žofie      14.0
Name: Age, dtype: float64

**Task:** Create a `Series` object that contains a list of colors, animals, numbers, or some other category of things you like.

## Index

By default, columns and tables use an unnamed numeric index, which numbers the elements consecutively from zero:

In [19]:
vysky.index

RangeIndex(start=0, stop=807, step=1)

However, there are other kinds of indexes (mostly given as an explicit list of values):

In [20]:
vek.index

Index(['Karla', 'Martina', 'Žofie'], dtype='object')

In [21]:
events = pd.Series(
    ["Independence of Czechoslovakia", "End of World War II", "Velvet Revolution"],
    index=pd.Index([1918, 1945, 1989], name="year"),  # An index can also have a name
)
events

year
1918    Independence of Czechoslovakia
1945               End of World War II
1989                 Velvet Revolution
dtype: object

In [22]:
events.index

Int64Index([1918, 1945, 1989], dtype='int64', name='year')

This index is numeric, but its values need not be sorted (here they are, but they would not have to be) and they have gaps: the labels are not consecutive.
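To make the gaps concrete, here is a minimal sketch that rebuilds the same kind of `Series` under the hypothetical name `events_by_year` and looks values up by label. The year 1950 is an arbitrary example of a label that is not in the index.

```python
import pandas as pd

events_by_year = pd.Series(
    ["Independence of Czechoslovakia", "End of World War II", "Velvet Revolution"],
    index=pd.Index([1918, 1945, 1989], name="year"),
)

# Square brackets look the value up by label, not by position.
by_label = events_by_year[1945]

# A year that is not among the labels is simply missing.
try:
    events_by_year[1950]
    found_1950 = True
except KeyError:
    found_1950 = False

# A membership test on the index tells you in advance.
has_1989 = 1989 in events_by_year.index

print(by_label, found_1950, has_1989)
```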

In [23]:
events_precise = pd.Series(
    ["Independence of Czechoslovakia", "End of World War II", "Velvet Revolution"],
    index=pd.DatetimeIndex(["1918-10-28", "1945-05-08", "1989-11-17"]),
)
events_precise.index

DatetimeIndex(['1918-10-28', '1945-05-08', '1989-11-17'], dtype='datetime64[ns]', freq=None)

The index values can then be used in square brackets to access `Series` elements, much like with a dictionary. The possibilities are much wider, though; we will show them shortly in the context of `DataFrame`.

In [24]:
vek["Martina"]

65.0

**Task:** What index does the Pokémon table have?

## DataFrame

<img src="static/df.svg" style="max-height: 25em;"/>

Now that we are familiar with columns and indexes, we can return to the table, that is, the `DataFrame`.
Just as a `Series` is a container of values associated with an index, a `DataFrame` is a two-dimensional container that, in addition to the values (`.values`), contains two indexes: one for rows and one for columns:

In [25]:
tabulka_pokemonu.columns  # Column labels

Index(['id', 'name', 'height', 'weight', 'color', 'shape', 'is baby', 'type 1',
       'type 2', 'hp', 'attack', 'defense', 'speed'],
      dtype='object')

In [26]:
tabulka_pokemonu.index  # Index (row labels)

RangeIndex(start=0, stop=807, step=1)

In [27]:
tabulka_pokemonu.values

array([[1, 'bulbasaur', 0.7, ..., 49, 49, 45],
       [2, 'ivysaur', 1.0, ..., 62, 63, 60],
       [3, 'venusaur', 2.0, ..., 82, 83, 80],
       ...,
       [805, 'stakataka', 5.5, ..., 131, 211, 13],
       [806, 'blacephalon', 1.8, ..., 127, 53, 107],
       [807, 'zeraora', 1.5, ..., 112, 75, 143]], dtype=object)

In [28]:
tabulka_pokemonu.shape  # Size (number of rows × number of columns)

(807, 13)

There are several ways to construct a new table (besides loading data from an external file); probably the most common are a list of dictionaries and a dictionary of lists. As with `Series`, some attributes can be supplied as additional arguments.

In [29]:
pd.DataFrame({
    "number": [1, 2, 3],
    "letter": ["a", "b", "c"],
})

|     | number | letter |
| --- | --- | --- |
| 0 | 1 | a |
| 1 | 2 | b |
| 2 | 3 | c |


In [30]:
pd.DataFrame(
    [
        {"name": "butter", "price": 42.90},
        {"name": "cheese", "price": 31.90},
        {"name": "ketchup", "price": 49.90},
    ],
    index=["article1", "article2", "article3"],
)

|     | name | price |
| --- | --- | --- |
| article1 | butter | 42.9 |
| article2 | cheese | 31.9 |
| article3 | ketchup | 49.9 |


**Task:** Create a table (`DataFrame`) with the columns "first name", "last name" and "age" for characters from one of your favorite novels or movies. You can, but do not have to, give it an index.

## Indexing
Rows, columns, positions, keys, ranges ... pandas objects sometimes behave like lists and sometimes like dictionaries. So how do you get values out of them? There are so many options that plain square brackets `[]` are not enough for accessing parts of a table.
To start, let's give our Pokémon table an interesting and easy-to-grasp index. We will use two methods of the `DataFrame` class (both return a new `DataFrame` instance derived from the instance they are called on):
* `set_index` returns a table in which one of the columns is used as the index
* `sort_index` returns a table with the same index, but sorted
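That both methods return a new table and leave the original untouched can be checked on a small example. This is only a sketch: the table `original` is a hypothetical three-row excerpt (names and ids taken from the Pokémon data).

```python
import pandas as pd

original = pd.DataFrame({
    "name": ["zubat", "abra", "absol"],
    "id": [41, 63, 359],
})

by_name = original.set_index("name")     # new table, "name" became the index
sorted_by_name = by_name.sort_index()    # new table, rows ordered by the index

print(list(original.columns))      # the original still has its "name" column
print(list(by_name.index))         # row order is unchanged by set_index
print(list(sorted_by_name.index))  # alphabetical order after sort_index
```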

In [31]:
pokemoni = tabulka_pokemonu.set_index("name").sort_index()
pokemoni

| name | id | height | weight | color | shape | is baby | type 1 | type 2 | hp | attack | defense | speed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| abomasnow | 460 | 2.2 | 135.5 | white | upright | False | Grass | Ice | 90 | 92 | 75 | 60 |
| abra | 63 | 0.9 | 19.5 | brown | upright | False | Psychic | NaN | 25 | 20 | 15 | 90 |
| absol | 359 | 1.2 | 47.0 | white | quadruped | False | Dark | NaN | 65 | 130 | 60 | 75 |
| accelgor | 617 | 0.8 | 25.3 | red | arms | False | Bug | NaN | 80 | 70 | 40 | 145 |
| aegislash | 681 | 1.7 | 53.0 | brown | blob | False | Steel | Ghost | 60 | 50 | 150 | 60 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| zoroark | 571 | 1.6 | 81.1 | gray | upright | False | Dark | NaN | 60 | 105 | 60 | 105 |
| zorua | 570 | 0.7 | 12.5 | gray | quadruped | False | Dark | NaN | 40 | 65 | 40 | 65 |
| zubat | 41 | 0.8 | 7.5 | purple | wings | False | Poison | Flying | 40 | 45 | 35 | 55 |
| zweilous | 634 | 1.4 | 50.0 | blue | quadruped | False | Dark | Dragon | 72 | 85 | 70 | 58 |
| zygarde | 718 | 5.0 | 305.0 | green | squiggle | False | Dragon | Ground | 108 | 100 | 121 | 95 |


In [32]:
pokemoni.index

Index(['abomasnow', 'abra', 'absol', 'accelgor', 'aegislash', 'aerodactyl',
       'aggron', 'aipom', 'alakazam', 'alomomola',
       ...
       'zapdos', 'zebstrika', 'zekrom', 'zeraora', 'zigzagoon', 'zoroark',
       'zorua', 'zubat', 'zweilous', 'zygarde'],
      dtype='object', name='name', length=807)

### `[]`

Let's start with square brackets:
* For a `Series`, they return the value whose key matches in the index (we showed this above).
* For a `DataFrame`, they return the column with the given name.

In [33]:
pokemoni["height"]

name
abomasnow    2.2
abra         0.9
absol        1.2
accelgor     0.8
aegislash    1.7
            ... 
zoroark      1.6
zorua        0.7
zubat        0.8
zweilous     1.4
zygarde      5.0
Name: height, Length: 807, dtype: float64

If you put a list of several names in the square brackets of a `DataFrame`, you get several columns (and therefore a `DataFrame`!):

In [34]:
pokemoni[["height", "weight"]]

| name | height | weight |
| --- | --- | --- |
| abomasnow | 2.2 | 135.5 |
| abra | 0.9 | 19.5 |
| absol | 1.2 | 47.0 |
| accelgor | 0.8 | 25.3 |
| aegislash | 1.7 | 53.0 |
| ... | ... | ... |
| zoroark | 1.6 | 81.1 |
| zorua | 0.7 | 12.5 |
| zubat | 0.8 | 7.5 |
| zweilous | 1.4 | 50.0 |
| zygarde | 5.0 | 305.0 |


**Task:** What happens when you do the same with a `Series`?

**Task:** Which of the last 5 Pokémon (in alphabetical order) is the fastest?

### `.loc[]`
When we want to get a row, we use the `loc` attribute, a so-called indexer. Be careful: it is not a method, so it takes square brackets rather than round ones. (There is a reason for this: it is the only way to elegantly use the colon shorthand for ranges.)

In [35]:
pokemoni.loc["abra"]

id              63
height         0.9
weight        19.5
color        brown
shape      upright
is baby      False
type 1     Psychic
type 2         NaN
hp              25
attack          20
defense         15
speed           90
Name: abra, dtype: object

We asked for the row with the index "abra" and got the expected result: a `Series` in which each value is indexed by the name of its column.
The situation gets more interesting when we start using ranges in the index (remember that dictionaries cannot do that):

In [36]:
pokemoni.loc["z":]

| name | id | height | weight | color | shape | is baby | type 1 | type 2 | hp | attack | defense | speed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zangoose | 335 | 1.3 | 40.3 | white | upright | False | Normal | NaN | 73 | 115 | 60 | 90 |
| zapdos | 145 | 1.6 | 52.6 | yellow | wings | False | Electric | Flying | 90 | 90 | 85 | 100 |
| zebstrika | 523 | 1.6 | 79.5 | black | quadruped | False | Electric | NaN | 75 | 100 | 63 | 116 |
| zekrom | 644 | 2.9 | 345.0 | black | upright | False | Dragon | Electric | 100 | 150 | 120 | 90 |
| zeraora | 807 | 1.5 | 44.5 | yellow | humanoid | False | Electric | NaN | 88 | 112 | 75 | 143 |
| zigzagoon | 263 | 0.4 | 17.5 | brown | quadruped | False | Normal | NaN | 38 | 30 | 41 | 60 |
| zoroark | 571 | 1.6 | 81.1 | gray | upright | False | Dark | NaN | 60 | 105 | 60 | 105 |
| zorua | 570 | 0.7 | 12.5 | gray | quadruped | False | Dark | NaN | 40 | 65 | 40 | 65 |
| zubat | 41 | 0.8 | 7.5 | purple | wings | False | Poison | Flying | 40 | 45 | 35 | 55 |
| zweilous | 634 | 1.4 | 50.0 | blue | quadruped | False | Dark | Dragon | 72 | 85 | 70 | 58 |
| zygarde | 718 | 5.0 | 305.0 | green | squiggle | False | Dragon | Ground | 108 | 100 | 121 | 95 |


pandas cleverly understood that we want all keys from a certain point on, even though the boundary key itself ("z") is not present in the index.
⚠️ However, this only works with a sorted index. If the index is not sorted, a range can only be selected between keys that exist in the index; rows are then taken in index order, including both endpoints, as follows:

In [37]:
pokemoni.loc["zangoose":"zygarde"]

| name | id | height | weight | color | shape | is baby | type 1 | type 2 | hp | attack | defense | speed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zangoose | 335 | 1.3 | 40.3 | white | upright | False | Normal | NaN | 73 | 115 | 60 | 90 |
| zapdos | 145 | 1.6 | 52.6 | yellow | wings | False | Electric | Flying | 90 | 90 | 85 | 100 |
| zebstrika | 523 | 1.6 | 79.5 | black | quadruped | False | Electric | NaN | 75 | 100 | 63 | 116 |
| zekrom | 644 | 2.9 | 345.0 | black | upright | False | Dragon | Electric | 100 | 150 | 120 | 90 |
| zeraora | 807 | 1.5 | 44.5 | yellow | humanoid | False | Electric | NaN | 88 | 112 | 75 | 143 |
| zigzagoon | 263 | 0.4 | 17.5 | brown | quadruped | False | Normal | NaN | 38 | 30 | 41 | 60 |
| zoroark | 571 | 1.6 | 81.1 | gray | upright | False | Dark | NaN | 60 | 105 | 60 | 105 |
| zorua | 570 | 0.7 | 12.5 | gray | quadruped | False | Dark | NaN | 40 | 65 | 40 | 65 |
| zubat | 41 | 0.8 | 7.5 | purple | wings | False | Poison | Flying | 40 | 45 | 35 | 55 |
| zweilous | 634 | 1.4 | 50.0 | blue | quadruped | False | Dark | Dragon | 72 | 85 | 70 | 58 |
| zygarde | 718 | 5.0 | 305.0 | green | squiggle | False | Dragon | Ground | 108 | 100 | 121 | 95 |
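The difference between a sorted and an unsorted index when slicing by label can be sketched on a three-element `Series`. The object `unsorted` is hypothetical; its labels and values (ids) are taken from the Pokémon table.

```python
import pandas as pd

unsorted = pd.Series([41, 63, 359], index=["zubat", "abra", "absol"])

# Both labels exist: rows are taken in stored order, endpoints included.
existing_range = unsorted.loc["zubat":"abra"]

# "b" is not a label, and without a sorted index pandas cannot place it.
try:
    unsorted.loc["b":]
    slice_worked = True
except KeyError:
    slice_worked = False

# After sorting, the same slice is fine.
sorted_range = unsorted.sort_index().loc["b":]

print(list(existing_range.index), slice_worked, list(sorted_range.index))
```

This is one reason why the Pokémon table was sorted right after setting its index.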


If you want a specific value, use two keys in the square brackets, in the order *row*, *column*.

In [38]:
pokemoni.loc["zorua", "color"]

'gray'

But pay attention to the number of brackets. If a list of keys appears inside the brackets, all matching rows or columns are selected in that dimension:

In [39]:
pokemoni.loc[["zorua", "zubat"]]

| name | id | height | weight | color | shape | is baby | type 1 | type 2 | hp | attack | defense | speed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zorua | 570 | 0.7 | 12.5 | gray | quadruped | False | Dark | NaN | 40 | 65 | 40 | 65 |
| zubat | 41 | 0.8 | 7.5 | purple | wings | False | Poison | Flying | 40 | 45 | 35 | 55 |


Naturally (or perhaps not so naturally?), the approaches can be combined, so you can select ranges and lists in rows and columns independently:

In [40]:
pokemoni.loc["j":"k", ["color", "attack"]]

| name | color | attack |
| --- | --- | --- |
| jangmo-o | gray | 55 |
| jellicent | white | 60 |
| jigglypuff | pink | 45 |
| jirachi | yellow | 100 |
| jolteon | yellow | 65 |
| joltik | yellow | 47 |
| jumpluff | blue | 55 |
| jynx | red | 50 |


**Task:** What color are (all) the Pokémon whose name starts with "z"?

**Task:** How many Pokémon have a name between the letters "d" and "f"?

**Task:** From the list of all Pokémon, pick 5 whose names you like (avoid the first and last five). What type are they? Which is the tallest and which is the heaviest?

### `.iloc[]`
If we want to forget for a moment which index a table or column uses, we can access the elements directly by their position (row or column number). This is fairly intuitive and matches the indexing you know from lists.

In [41]:
pokemoni.iloc[44]

id                15
height             1
weight          29.5
color         yellow
shape      bug-wings
is baby        False
type 1           Bug
type 2        Poison
hp                65
attack            90
defense           40
speed             75
Name: beedrill, dtype: object

In [42]:
pokemoni.iloc[-10:]

| name | id | height | weight | color | shape | is baby | type 1 | type 2 | hp | attack | defense | speed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zapdos | 145 | 1.6 | 52.6 | yellow | wings | False | Electric | Flying | 90 | 90 | 85 | 100 |
| zebstrika | 523 | 1.6 | 79.5 | black | quadruped | False | Electric | NaN | 75 | 100 | 63 | 116 |
| zekrom | 644 | 2.9 | 345.0 | black | upright | False | Dragon | Electric | 100 | 150 | 120 | 90 |
| zeraora | 807 | 1.5 | 44.5 | yellow | humanoid | False | Electric | NaN | 88 | 112 | 75 | 143 |
| zigzagoon | 263 | 0.4 | 17.5 | brown | quadruped | False | Normal | NaN | 38 | 30 | 41 | 60 |
| zoroark | 571 | 1.6 | 81.1 | gray | upright | False | Dark | NaN | 60 | 105 | 60 | 105 |
| zorua | 570 | 0.7 | 12.5 | gray | quadruped | False | Dark | NaN | 40 | 65 | 40 | 65 |
| zubat | 41 | 0.8 | 7.5 | purple | wings | False | Poison | Flying | 40 | 45 | 35 | 55 |
| zweilous | 634 | 1.4 | 50.0 | blue | quadruped | False | Dark | Dragon | 72 | 85 | 70 | 58 |
| zygarde | 718 | 5.0 | 305.0 | green | squiggle | False | Dragon | Ground | 108 | 100 | 121 | 95 |


Here, too, the approaches can be combined. So when someone asks you for the value in the "bottom left" corner, you can try:

In [43]:
pokemoni.iloc[-1, 0]

718
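To compare position-based and label-based access side by side, here is a small sketch on a hypothetical three-row table `small` (names, ids and colors copied from the Pokémon data).

```python
import pandas as pd

small = pd.DataFrame(
    {"id": [63, 359, 41], "color": ["brown", "white", "purple"]},
    index=pd.Index(["abra", "absol", "zubat"], name="name"),
)

# The same row, once by position and once by label.
first_row_by_position = small.iloc[0]
first_row_by_label = small.loc["abra"]

# The same cell: last row and first column versus explicit labels.
bottom_left = small.iloc[-1, 0]
same_cell_by_label = small.loc["zubat", "id"]

# Positional slices exclude the end, label slices include it.
two_rows_by_position = small.iloc[0:2]
two_rows_by_label = small.loc["abra":"absol"]

print(bottom_left, same_cell_by_label)
print(two_rows_by_position.equals(two_rows_by_label))
```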

Finally, just for completeness, let's introduce three convenient methods that select the first, last, or random rows of a table (all three take an optional parameter specifying the number of rows):

In [44]:
pokemoni.head()  # The first few rows

| name | id | height | weight | color | shape | is baby | type 1 | type 2 | hp | attack | defense | speed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| abomasnow | 460 | 2.2 | 135.5 | white | upright | False | Grass | Ice | 90 | 92 | 75 | 60 |
| abra | 63 | 0.9 | 19.5 | brown | upright | False | Psychic | NaN | 25 | 20 | 15 | 90 |
| absol | 359 | 1.2 | 47.0 | white | quadruped | False | Dark | NaN | 65 | 130 | 60 | 75 |
| accelgor | 617 | 0.8 | 25.3 | red | arms | False | Bug | NaN | 80 | 70 | 40 | 145 |
| aegislash | 681 | 1.7 | 53.0 | brown | blob | False | Steel | Ghost | 60 | 50 | 150 | 60 |


In [45]:
pokemoni.tail()  # The last few rows

| name | id | height | weight | color | shape | is baby | type 1 | type 2 | hp | attack | defense | speed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zoroark | 571 | 1.6 | 81.1 | gray | upright | False | Dark | NaN | 60 | 105 | 60 | 105 |
| zorua | 570 | 0.7 | 12.5 | gray | quadruped | False | Dark | NaN | 40 | 65 | 40 | 65 |
| zubat | 41 | 0.8 | 7.5 | purple | wings | False | Poison | Flying | 40 | 45 | 35 | 55 |
| zweilous | 634 | 1.4 | 50.0 | blue | quadruped | False | Dark | Dragon | 72 | 85 | 70 | 58 |
| zygarde | 718 | 5.0 | 305.0 | green | squiggle | False | Dragon | Ground | 108 | 100 | 121 | 95 |


**Task:** Can you write an equivalent of the `.tail()` method using indexing?

In [46]:
pokemoni.sample()  # A random row

| name | id | height | weight | color | shape | is baby | type 1 | type 2 | hp | attack | defense | speed |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| totodile | 158 | 0.6 | 9.5 | blue | upright | False | Water | NaN | 50 | 65 | 64 | 43 |
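The optional row-count parameter of the three methods can be tried out on a small table. This is a sketch only: `small` is a hypothetical five-row excerpt of the Pokémon data, and `random_state=0` is an arbitrary seed used to make the random selection repeatable.

```python
import pandas as pd

small = pd.DataFrame(
    {"id": [460, 63, 359, 617, 681]},
    index=pd.Index(
        ["abomasnow", "abra", "absol", "accelgor", "aegislash"], name="name"
    ),
)

first_two = small.head(2)   # the first two rows
last_two = small.tail(2)    # the last two rows

# random_state makes the "random" choice repeatable between runs.
random_three = small.sample(3, random_state=0)

print(list(first_two.index))
print(list(last_two.index))
print(list(random_three.index))
```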


**Task (bonus):** Can you write an equivalent of the `.sample()` method using indexing (and the `random` module)?

## Summary
In this lesson, we introduced three basic types of the `pandas` library:
* `Series`, a one-dimensional object containing values of the same type
* `DataFrame`, a two-dimensional table composed of several `Series`
* `Index`, a generalized description of how to access the elements of a `Series` or `DataFrame`

In addition, we learned to select columns, rows, and individual values from tables.
In the next lesson we will show which data types (more precisely, `dtypes`) can be used in `pandas`, and we will start computing and imitating the functions of spreadsheets.

## Exercises

The local zoo is considering investing in a new pavilion dedicated to Pokémon. But the zoo's director, Mr. Felix, is not sure whether the investment would pay off and what it would all mean for the zoo. Someone advised him to call on you for help (it was not us, we swear; course authors' note). The director has compiled a list of questions he would like answered.

0. (Reload the data from the `pokemon.csv` file.)
1. For how many new animals would the zoo need food? The director would like one male and one female of each species (name).
2. The marketing department is going to create new leaflets about Pokémon for zoo visitors. For every Pokémon they would need its height, weight, length, color and type. Is all of this information available?
3. The zoo considers it ideal for the Pokémon to be delivered gradually, in groups of eight, in the order listed in the `pokemon.csv` table. Which Pokémon would be in the first, second, and last group?
4. The zoo's operations department also pointed out the special conditions needed for the 3 tallest Pokémon. In the `pokemon.csv` table they are at positions 207, 320 and 796, but nobody remembers which Pokémon they were. What are their names?
5. The director loves Onix. He would like to build a special enclosure for it, intended for Rock-type Pokémon with speed above 50. Would Onix like it there?
6. The director would also like to create a section for all Pokémon whose name starts with "i". But Dark-type Pokémon cannot be kept with Normal, Fire with Water or Grass, and Electric with Psychic. Will it be possible to create this section? (Tip: to display all Pokémon starting with "i", you will need the index sorted alphabetically.)