Wojciech Łaguna  
[wojtek@laguna.pm](mailto:wojtek@laguna.pm)

Mateusz Rogowski

# Pandas - Introduction, loading and saving CSV files, browsing data

![pandas logo](https://cdn.shortpixel.ai/spai/w_300+q_lossy+ret_img+to_webp/https://www.numfocus.org/wp-content/uploads/2016/07/pandas-logo-300.png)

----
Sites:  
http://pandas.pydata.org/  
  
https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python


**Kaggle Titanic**  
https://www.kaggle.com/c/titanic  
https://www.dataquest.io/course/kaggle-competitions

In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.read_csv('./data/titanic_train.csv')

In [3]:
df

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S
...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C


In [None]:
df.head(2)

In [None]:
df.tail(1)

In [None]:
df.sample(4, random_state=2022)
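
The three preview calls above can be tried without the Titanic file. This is a minimal sketch on a made-up frame (`demo` and its values are assumptions, not part of the dataset); it also illustrates that a fixed `random_state` makes `sample` return the same rows on every call.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic frame: 10 rows, two columns.
demo = pd.DataFrame({
    "PassengerId": range(1, 11),
    "Fare": [7.25 * i for i in range(1, 11)],
})

first_two = demo.head(2)  # first 2 rows
last_one = demo.tail(1)   # last row only

# The same random_state yields the same 4 rows each time.
sample_a = demo.sample(4, random_state=2022)
sample_b = demo.sample(4, random_state=2022)
print(sample_a.equals(sample_b))
```

Leaving `random_state` out gives a different sample on each run, which is usually fine for a quick look but makes notebooks harder to reproduce.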

### Features

- **PassengerId** - Numeric ID assigned to each passenger.
- **Survived** - Whether the passenger survived (1) or did not survive (0). This is the target class.
- **Pclass** - Ticket class (1 = 1st, highest; 2 = 2nd, middle; 3 = 3rd, lowest)
- **Name** - Passenger's first and last name
- **Sex** - Sex
- **Age** - Age
- **SibSp** - Number of relatives travelling with the passenger (siblings, wife, husband; excluding parents and children)
- **Parch** - Number of parents or children travelling with the passenger
- **Ticket** - Ticket number
- **Fare** - Ticket price
- **Cabin** - Cabin number
- **Embarked** - Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)

# Basic information about the dataset

In [None]:
df.shape

In [None]:
df.info()

In [None]:
df.describe().round(2)
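
As a rough illustration of what `shape` and `describe` return, here is a sketch on a four-row frame. The values are copied from the first rows shown earlier, but `demo` itself is a stand-in, not the real dataset. By default `describe` summarises only the numeric columns, so the text column is left out.

```python
import pandas as pd

# Small stand-in frame: two numeric columns and one text column.
demo = pd.DataFrame({
    "Pclass": [3, 1, 3, 1],
    "Fare": [7.25, 71.2833, 7.925, 53.1],
    "Sex": ["male", "female", "female", "female"],
})

n_rows, n_cols = demo.shape           # shape is a (rows, columns) tuple
summary = demo.describe().round(2)    # count, mean, std, min, quartiles, max

print(n_rows, n_cols)
print(summary.columns.tolist())       # numeric columns only
```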

In [4]:
df['Survived'].value_counts()

0    549
1    342
Name: Survived, dtype: int64

In [5]:
df['Survived'].value_counts(normalize=True)

0    0.616162
1    0.383838
Name: Survived, dtype: float64
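
The shares printed by `normalize=True` are simply the counts divided by the total number of rows. The sketch below rebuilds a series with the same counts as the output above (549 zeros and 342 ones) instead of loading the file; the variable names are illustrative.

```python
import pandas as pd

# Rebuild a column with the counts reported above: 549 zeros, 342 ones.
survived = pd.Series([0] * 549 + [1] * 342, name="Survived")

counts = survived.value_counts()
shares = survived.value_counts(normalize=True)

# The same shares computed by hand: each count divided by the total.
manual = counts / counts.sum()

print(round(shares.loc[0], 6), round(shares.loc[1], 6))
```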

In [6]:
df.columns

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

In [7]:
df.dtypes

PassengerId      int64
Survived         int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object

# Selecting columns and rows

## Selecting columns

In [8]:
df.head(1)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S


In [9]:
df['Pclass']

0      3
1      1
2      3
3      1
4      3
      ..
886    2
887    1
888    3
889    1
890    3
Name: Pclass, Length: 891, dtype: int64

In [10]:
df.Pclass

0      3
1      1
2      3
3      1
4      3
      ..
886    2
887    1
888    3
889    1
890    3
Name: Pclass, Length: 891, dtype: int64

In [11]:
df[['Cabin', 'Pclass']]

Unnamed: 0,Cabin,Pclass
0,,3
1,C85,1
2,,3
3,C123,1
4,,3
...,...,...
886,,2
887,B42,1
888,,3
889,C148,1


In [12]:
df[['Pclass']]

Unnamed: 0,Pclass
0,3
1,1
2,3
3,1
4,3
...,...
886,2
887,1
888,3
889,1
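
The cells above differ in the type they return: single brackets with one column name give a `Series`, while a list of names (even a one-element list) gives a `DataFrame`. A minimal sketch on a made-up three-row frame:

```python
import pandas as pd

# Hypothetical three-row frame with the two columns used above.
demo = pd.DataFrame({"Pclass": [3, 1, 3], "Cabin": [None, "C85", None]})

as_series = demo["Pclass"]     # one name       -> Series
as_attr = demo.Pclass          # attribute form -> the same Series
as_frame = demo[["Pclass"]]    # list of names  -> DataFrame with one column

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

Attribute access only works when the column name is a valid Python identifier and does not clash with an existing `DataFrame` attribute, so the bracket form is the safer habit.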


## Selecting a range of rows

In [13]:
df[:2]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C


In [14]:
df[3:6]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S
5,6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q


In [15]:
df[:6:2]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


## Selecting by column names and index labels - loc[]
---

http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-label

In [16]:
df.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [17]:
df.loc[2]

PassengerId                         3
Survived                            1
Pclass                              3
Name           Heikkinen, Miss. Laina
Sex                            female
Age                              26.0
SibSp                               0
Parch                               0
Ticket               STON/O2. 3101282
Fare                            7.925
Cabin                             NaN
Embarked                            S
Name: 2, dtype: object

In [18]:
row = df.loc[2]

In [19]:
row

PassengerId                         3
Survived                            1
Pclass                              3
Name           Heikkinen, Miss. Laina
Sex                            female
Age                              26.0
SibSp                               0
Parch                               0
Ticket               STON/O2. 3101282
Fare                            7.925
Cabin                             NaN
Embarked                            S
Name: 2, dtype: object

In [20]:
row['Name']

'Heikkinen, Miss. Laina'

In [21]:
df.loc[5:10]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
5,6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S
7,8,0,3,"Palsson, Master. Gosta Leonard",male,2.0,3,1,349909,21.075,,S
8,9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,0,2,347742,11.1333,,S
9,10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,1,0,237736,30.0708,,C
10,11,1,3,"Sandstrom, Miss. Marguerite Rut",female,4.0,1,1,PP 9549,16.7,G6,S


In [22]:
df.loc[9, 'Name']

'Nasser, Mrs. Nicholas (Adele Achem)'

In [23]:
df.loc[5:10, 'Name']

5                                      Moran, Mr. James
6                               McCarthy, Mr. Timothy J
7                        Palsson, Master. Gosta Leonard
8     Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)
9                   Nasser, Mrs. Nicholas (Adele Achem)
10                      Sandstrom, Miss. Marguerite Rut
Name: Name, dtype: object

In [24]:
# To look up the column names
df.columns

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

In [25]:
df.loc[5:10, :'Pclass']

Unnamed: 0,PassengerId,Survived,Pclass
5,6,0,3
6,7,0,1
7,8,0,3
8,9,1,3
9,10,1,2
10,11,1,3


In [26]:
df.loc[5:10, ['Survived', 'Name', 'Pclass']]

Unnamed: 0,Survived,Name,Pclass
5,0,"Moran, Mr. James",3
6,0,"McCarthy, Mr. Timothy J",1
7,0,"Palsson, Master. Gosta Leonard",3
8,1,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",3
9,1,"Nasser, Mrs. Nicholas (Adele Achem)",2
10,1,"Sandstrom, Miss. Marguerite Rut",3


## Selecting by order (position, not index label) - iloc[]
---

http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-integer

In [27]:
df.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [28]:
temp = df.sort_values('Pclass')
temp.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
445,446,1,1,"Dodge, Master. Washington",male,4.0,0,2,33638,81.8583,A34,S
310,311,1,1,"Hays, Miss. Margaret Bechstein",female,24.0,0,0,11767,83.1583,C54,C
309,310,1,1,"Francatelli, Miss. Laura Mabel",female,30.0,0,0,PC 17485,56.9292,E36,C
307,308,1,1,"Penasco y Castellana, Mrs. Victor de Satode (M...",female,17.0,1,0,PC 17758,108.9,C65,C
306,307,1,1,"Fleming, Miss. Margaret",female,,0,0,17421,110.8833,,C


In [29]:
temp.loc[:3]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
445,446,1,1,"Dodge, Master. Washington",male,4.0,0,2,33638,81.8583,A34,S
310,311,1,1,"Hays, Miss. Margaret Bechstein",female,24.0,0,0,11767,83.1583,C54,C
309,310,1,1,"Francatelli, Miss. Laura Mabel",female,30.0,0,0,PC 17485,56.9292,E36,C
307,308,1,1,"Penasco y Castellana, Mrs. Victor de Satode (M...",female,17.0,1,0,PC 17758,108.9000,C65,C
306,307,1,1,"Fleming, Miss. Margaret",female,,0,0,17421,110.8833,,C
...,...,...,...,...,...,...,...,...,...,...,...,...
796,797,1,1,"Leader, Dr. Alice (Farnham)",female,49.0,0,0,17465,25.9292,D17,S
815,816,0,1,"Fry, Mr. Richard",male,,0,0,112058,0.0000,B102,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
52,53,1,1,"Harper, Mrs. Henry Sleeper (Myna Haxtun)",female,49.0,1,0,PC 17572,76.7292,D33,C


In [30]:
temp2 = temp.iloc[:3]
temp2

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
445,446,1,1,"Dodge, Master. Washington",male,4.0,0,2,33638,81.8583,A34,S
310,311,1,1,"Hays, Miss. Margaret Bechstein",female,24.0,0,0,11767,83.1583,C54,C
309,310,1,1,"Francatelli, Miss. Laura Mabel",female,30.0,0,0,PC 17485,56.9292,E36,C
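
The difference between `temp.loc[:3]` and `temp.iloc[:3]` is easier to see on a tiny frame. In this sketch (`demo` is made up; a stable sort is requested so the row order is predictable), `loc` slices by label and includes the end label, while `iloc` slices by position and excludes the end.

```python
import pandas as pd

demo = pd.DataFrame({"Pclass": [3, 1, 3, 1, 2]})  # index labels 0..4

# After sorting, the labels are no longer in order: [1, 3, 4, 0, 2].
by_class = demo.sort_values("Pclass", kind="mergesort")

by_label = by_class.loc[:3]      # every row up to AND including label 3
by_position = by_class.iloc[:3]  # the first three rows, whatever their labels

print(by_class.index.tolist())     # [1, 3, 4, 0, 2]
print(by_label.index.tolist())     # [1, 3]
print(by_position.index.tolist())  # [1, 3, 4]

# On the unsorted frame the end-point rule alone makes the difference:
print(len(demo.loc[1:3]), len(demo.iloc[1:3]))  # 3 2
```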


In [31]:
df.head(6)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S
5,6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q


In [32]:
df.iloc[5]

PassengerId                   6
Survived                      0
Pclass                        3
Name           Moran, Mr. James
Sex                        male
Age                         NaN
SibSp                         0
Parch                         0
Ticket                   330877
Fare                     8.4583
Cabin                       NaN
Embarked                      Q
Name: 5, dtype: object

In [33]:
df.iloc[5:10]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
5,6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
6,7,0,1,"McCarthy, Mr. Timothy J",male,54.0,0,0,17463,51.8625,E46,S
7,8,0,3,"Palsson, Master. Gosta Leonard",male,2.0,3,1,349909,21.075,,S
8,9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27.0,0,2,347742,11.1333,,S
9,10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14.0,1,0,237736,30.0708,,C


In [34]:
df.iloc[5, 'Name']  # Won't work: iloc accepts only integer positions, not labels

ValueError: Location based indexing can only have [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array] types
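
When the row is known by position but the column by name, one option is to translate the position into a label first and then use `loc`. This is a sketch under assumptions: `demo` is a made-up three-row frame whose labels (5, 6, 7) deliberately differ from the positions (0, 1, 2).

```python
import pandas as pd

demo = pd.DataFrame(
    {
        "PassengerId": [6, 7, 8],
        "Name": [
            "Moran, Mr. James",
            "McCarthy, Mr. Timothy J",
            "Palsson, Master. Gosta Leonard",
        ],
    },
    index=[5, 6, 7],
)

# Position 0 is the first row; demo.index[0] is its label (5 here).
name_via_loc = demo.loc[demo.index[0], "Name"]

# Alternative: take the row by position, then the field by name.
name_via_row = demo.iloc[0]["Name"]

print(name_via_loc)
```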

In [35]:
df.iloc[5, 3]

'Moran, Mr. James'

In [36]:
df.iloc[5:10, 3]

5                                     Moran, Mr. James
6                              McCarthy, Mr. Timothy J
7                       Palsson, Master. Gosta Leonard
8    Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)
9                  Nasser, Mrs. Nicholas (Adele Achem)
Name: Name, dtype: object

In [37]:
df.loc[5:10, :'Name']

Unnamed: 0,PassengerId,Survived,Pclass,Name
5,6,0,3,"Moran, Mr. James"
6,7,0,1,"McCarthy, Mr. Timothy J"
7,8,0,3,"Palsson, Master. Gosta Leonard"
8,9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)"
9,10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)"
10,11,1,3,"Sandstrom, Miss. Marguerite Rut"


In [38]:
df.iloc[5:10, :4]

Unnamed: 0,PassengerId,Survived,Pclass,Name
5,6,0,3,"Moran, Mr. James"
6,7,0,1,"McCarthy, Mr. Timothy J"
7,8,0,3,"Palsson, Master. Gosta Leonard"
8,9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)"
9,10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)"


In [None]:
df.iloc[range(0, 50, 5), [1, 3]]

In [None]:
df.iloc[:50:5, [1, 3]]
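
The two cells above should select the same rows and columns: `range(0, 50, 5)` lists the positions explicitly, and the slice `:50:5` describes them with start, stop, and step. A quick check on a made-up 100-row frame (column names `a` to `d` are placeholders):

```python
import pandas as pd

demo = pd.DataFrame({
    "a": range(100),
    "b": range(100, 200),
    "c": range(200, 300),
    "d": range(300, 400),
})

via_range = demo.iloc[range(0, 50, 5), [1, 3]]  # explicit positions
via_slice = demo.iloc[:50:5, [1, 3]]            # start:stop:step slice

print(via_range.index.tolist())
print(via_range.values.tolist() == via_slice.values.tolist())
```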

In [39]:
df.iloc[-10:]

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
881,882,0,3,"Markun, Mr. Johann",male,33.0,0,0,349257,7.8958,,S
882,883,0,3,"Dahlberg, Miss. Gerda Ulrika",female,22.0,0,0,7552,10.5167,,S
883,884,0,2,"Banfield, Mr. Frederick James",male,28.0,0,0,C.A./SOTON 34068,10.5,,S
884,885,0,3,"Sutehall, Mr. Henry Jr",male,25.0,0,0,SOTON/OQ 392076,7.05,,S
885,886,0,3,"Rice, Mrs. William (Margaret Norton)",female,39.0,0,5,382652,29.125,,Q
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.45,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0,C148,C
890,891,0,3,"Dooley, Mr. Patrick",male,32.0,0,0,370376,7.75,,Q


In [51]:
idx = pd.RangeIndex(5)
idx.get_slice_bound(3,'left')

TypeError: get_slice_bound() missing 1 required positional argument: 'kind'

Finding a column's position

In [52]:
df.columns

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

In [83]:
df.columns.get_slice_bound('Name', 'right','getitem')

4

In [61]:
df.iloc[:5, :df.columns.get_slice_bound('Name', 'left',None)]

Unnamed: 0,PassengerId,Survived,Pclass
0,1,0,3
1,2,1,1
2,3,1,3
3,4,1,1
4,5,0,3
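
The arguments accepted by `get_slice_bound` appear to differ between pandas versions, as the `TypeError` above suggests. For a uniquely named column, `Index.get_loc` is a simpler way to turn a name into an integer position. The sketch below uses a one-row stand-in frame; `demo` and the variable names are illustrative.

```python
import pandas as pd

demo = pd.DataFrame(
    [[1, 0, 3, "Braund, Mr. Owen Harris", "male"]],
    columns=["PassengerId", "Survived", "Pclass", "Name", "Sex"],
)

name_pos = demo.columns.get_loc("Name")      # integer position of the column

before_name = demo.iloc[:, :name_pos]        # columns left of Name
through_name = demo.iloc[:, :name_pos + 1]   # columns up to and including Name

print(name_pos)                       # 3
print(before_name.columns.tolist())   # ['PassengerId', 'Survived', 'Pclass']
```

`through_name` holds the same columns as the label-based slice `demo.loc[:, :'Name']`, because `loc` includes the end label.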


# Creating a new dataset from selected columns and rows

In [62]:
df_v2 = df[['Name', 'PassengerId', 'Survived']]

In [63]:
df_v2

Unnamed: 0,Name,PassengerId,Survived
0,"Braund, Mr. Owen Harris",1,0
1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",2,1
2,"Heikkinen, Miss. Laina",3,1
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",4,1
4,"Allen, Mr. William Henry",5,0
...,...,...,...
886,"Montvila, Rev. Juozas",887,0
887,"Graham, Miss. Margaret Edith",888,1
888,"Johnston, Miss. Catherine Helen ""Carrie""",889,0
889,"Behr, Mr. Karl Howell",890,1


In [64]:
df_v2 = df_v2[:20]

print("df_v2.shape", df_v2.shape)

df_v2

df_v2.shape (20, 3)


Unnamed: 0,Name,PassengerId,Survived
0,"Braund, Mr. Owen Harris",1,0
1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",2,1
2,"Heikkinen, Miss. Laina",3,1
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",4,1
4,"Allen, Mr. William Henry",5,0
5,"Moran, Mr. James",6,0
6,"McCarthy, Mr. Timothy J",7,0
7,"Palsson, Master. Gosta Leonard",8,0
8,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",9,1
9,"Nasser, Mrs. Nicholas (Adele Achem)",10,1


## Saving data as CSV

In [65]:
df_v2

Unnamed: 0,Name,PassengerId,Survived
0,"Braund, Mr. Owen Harris",1,0
1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",2,1
2,"Heikkinen, Miss. Laina",3,1
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",4,1
4,"Allen, Mr. William Henry",5,0
5,"Moran, Mr. James",6,0
6,"McCarthy, Mr. Timothy J",7,0
7,"Palsson, Master. Gosta Leonard",8,0
8,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",9,1
9,"Nasser, Mrs. Nicholas (Adele Achem)",10,1


In [66]:
df_v2.to_csv('./data/df_v2.csv', header=True, index=True)

## Reading data from CSV

In [67]:
df_from_file = pd.read_csv('./data/df_v2.csv', index_col=0)
df_from_file

Unnamed: 0,Name,PassengerId,Survived
0,"Braund, Mr. Owen Harris",1,0
1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",2,1
2,"Heikkinen, Miss. Laina",3,1
3,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",4,1
4,"Allen, Mr. William Henry",5,0
5,"Moran, Mr. James",6,0
6,"McCarthy, Mr. Timothy J",7,0
7,"Palsson, Master. Gosta Leonard",8,0
8,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",9,1
9,"Nasser, Mrs. Nicholas (Adele Achem)",10,1
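
The save and load steps can be tried without touching the disk by writing to an in-memory buffer. This sketch (the two-row `demo` frame is a stand-in) shows why `index_col=0` matters when the file was written with `index=True`: without it, the saved index comes back as an ordinary column named `Unnamed: 0`.

```python
import io
import pandas as pd

demo = pd.DataFrame({
    "Name": ["Braund, Mr. Owen Harris", "Heikkinen, Miss. Laina"],
    "Survived": [0, 1],
})

# Write to an in-memory text buffer instead of a file path.
buffer = io.StringIO()
demo.to_csv(buffer, header=True, index=True)
csv_text = buffer.getvalue()

restored = pd.read_csv(io.StringIO(csv_text), index_col=0)  # index restored
naive = pd.read_csv(io.StringIO(csv_text))                  # index becomes a column

print(csv_text.splitlines()[0])   # header row; the index column has no name
print(restored.columns.tolist())
print(naive.columns.tolist())
```

Writing with `index=False` avoids the extra column altogether when the index carries no information.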


# Exercises

### Load the file `'titanic_train.csv'` into the variable `titanic`. Assign to a new variable (`titanic_2`) a dataset with the columns _PassengerId, Ticket, Fare_

In [87]:
titanic = pd.read_csv('./data/titanic_train.csv')
titanic_2 = titanic.loc[:,['PassengerId','Ticket','Fare']]
titanic_2

Unnamed: 0,PassengerId,Ticket,Fare
0,1,A/5 21171,7.2500
1,2,PC 17599,71.2833
2,3,STON/O2. 3101282,7.9250
3,4,113803,53.1000
4,5,373450,8.0500
...,...,...,...
886,887,211536,13.0000
887,888,112053,30.0000
888,889,W./C. 6607,23.4500
889,890,111369,30.0000


### Select the last 20 rows of the `titanic_2` dataset and save them back to the same variable

In [77]:
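# This takes the last 10 rows of the full `titanic` frame under a new name; `titanic_2` is left unchanged
titanic_last_20 = titanic.iloc[-10:]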
titanic_last_20

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
881,882,0,3,"Markun, Mr. Johann",male,33.0,0,0,349257,7.8958,,S
882,883,0,3,"Dahlberg, Miss. Gerda Ulrika",female,22.0,0,0,7552,10.5167,,S
883,884,0,2,"Banfield, Mr. Frederick James",male,28.0,0,0,C.A./SOTON 34068,10.5,,S
884,885,0,3,"Sutehall, Mr. Henry Jr",male,25.0,0,0,SOTON/OQ 392076,7.05,,S
885,886,0,3,"Rice, Mrs. William (Margaret Norton)",female,39.0,0,5,382652,29.125,,Q
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.45,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0,C148,C
890,891,0,3,"Dooley, Mr. Patrick",male,32.0,0,0,370376,7.75,,Q
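
The exercise as worded asks for the last 20 rows of `titanic_2`, assigned back to the same name. One way to write that is sketched below on a made-up 30-row frame; `titanic_2_demo` and its values are hypothetical, so the recorded outputs in this notebook are not affected.

```python
import pandas as pd

# Hypothetical stand-in for titanic_2: 30 rows, the three exercise columns.
titanic_2_demo = pd.DataFrame({
    "PassengerId": range(1, 31),
    "Ticket": [f"T{i}" for i in range(1, 31)],
    "Fare": [float(i) for i in range(1, 31)],
})

# Keep only the last 20 rows and rebind the same variable name.
titanic_2_demo = titanic_2_demo.iloc[-20:]

print(titanic_2_demo.shape)  # (20, 3)
```

`titanic_2_demo.tail(20)` would select the same rows; both keep the original index labels, so the first remaining label here is 10 rather than 0.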


### Check the basic statistics of the `titanic_2` dataset (info, describe)

In [78]:
titanic_2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Ticket       891 non-null    object 
 2   Fare         891 non-null    float64
dtypes: float64(1), int64(1), object(1)
memory usage: 21.0+ KB


In [80]:
titanic_2.describe()

Unnamed: 0,PassengerId,Fare
count,891.0,891.0
mean,446.0,32.204208
std,257.353842,49.693429
min,1.0,0.0
25%,223.5,7.9104
50%,446.0,14.4542
75%,668.5,31.0
max,891.0,512.3292


### Save the `titanic_2` dataset as the CSV file `titanic_2.csv`

In [84]:
titanic_2.to_csv('titanic_2.csv', index=False)