# 2019 ADM Laboratory

1. Introduction to Python, Pandas, Datasets
2. Visualization
3. Document Databases, JSON, Mongo
4. Web Scraping, BeautifulSoup
5. Data Exchange, REST API, Social Networks, Twitter
6. Amazon Web Services
7. Amazon Web Services: MapReduce Paradigm
8. PageRank in MapReduce
9. Clustering, SKLearn
10. Python Pipeline

## Virtual Machine

    VirtualBox
    XUbuntu 18.04
    Python 3
    Jupyter/IPython notebook
    JetBrains PyCharm
    Pandas, BeautifulSoup, scikit-learn, networkx, pymongo, pyspark and matplotlib
    MongoDB
    Apache Spark with Hadoop

## Online Services

    mLab
    Amazon Web Services

# Part 1 - Python Basic & Compound Data Types

Python supports whole numbers (called integers) of arbitrary size, and real numbers (called floats) that are stored in double precision, which gives roughly 15 to 17 significant decimal digits.

In [1]:
number = 42
type(number)

int

In Jupyter we can mix Python code cells, like the one above, with text. Notice the label "In [number]" and the corresponding "Out [number]". The number represents the order in which that piece of code was executed.

Within a code cell we can query the value of a variable, like the cell below.

We can also add comments and formatted text in cells called "Markdown", like this one.

In [2]:
number

42

In [3]:
decimal = 3.14159265358979323846264338327950288419716939937510
type(decimal)
print(decimal)

3.141592653589793
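As a quick check of the two numeric types, the short sketch below (not part of the original notebook) shows that integers are stored exactly whatever their size, while floats are double-precision values. `sys.float_info.dig` reports how many decimal digits a float is guaranteed to preserve on the current platform.

```python
import sys

# Integers have arbitrary precision: 2**100 is stored exactly.
big = 2 ** 100
print(big)

# Number of decimal digits a float can always represent faithfully
# (15 on platforms that use IEEE 754 double precision).
print(sys.float_info.dig)

# Rounding errors are therefore normal when comparing floats.
exact_match = (0.1 + 0.2 == 0.3)
close_enough = abs((0.1 + 0.2) - 0.3) < 1e-9
print(exact_match, close_enough)   # False True
```

For this reason floats are usually compared with a tolerance rather than with `==`.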


Text is stored in what is called a 'string' variable.

In [4]:
text = 'This is my text'
another = "We can also use double-quotes"
mix = "Mix is 'simple'"

In [5]:
print(text)
print(another)
print(mix)

This is my text
We can also use double-quotes
Mix is 'simple'


In [6]:
mix

"Mix is 'simple'"

In [7]:
text

'This is my text'

The slicing operator [ ] can be used with strings.

In [8]:
text[2:4]

'is'

In [9]:
text[:-5]

'This is my'

In [10]:
text[5:]

'is my text'

In [11]:
text[5]

'i'
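The slicing examples above use a start and a stop position. A slice also accepts negative positions, which count from the end, and an optional third value, the step. The following sketch reuses the same `text` value to illustrate both.

```python
text = 'This is my text'

first_word = text[:4]        # characters at positions 0..3
last_word = text[-4:]        # the last four characters
every_other = text[::2]      # every second character, starting at position 0
reversed_text = text[::-1]   # a negative step walks the string backwards

print(first_word, last_word)  # This text
print(every_other)            # Ti sm et
print(reversed_text)          # txet ym si sihT
```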

Notice the different formatting of the contents of the variable when we use the __print__ command and when we query the value directly in the Jupyter environment.

Python supports compound data types such as tuples, lists, sets and dictionaries.

### Tuples 

A tuple is a collection which is ordered and unchangeable.

In [12]:
thistuple = (42, "kiwi", 14.911)
type(thistuple)

tuple

In [13]:
thistuple

(42, 'kiwi', 14.911)

We can access element **i** of the tuple using [**i**]. The first element's index is 0, that is, i=0.

In [14]:
thistuple[0]

42

In [15]:
print(thistuple[2], thistuple[1])

14.911 kiwi


In [16]:
thistuple[3]

IndexError: tuple index out of range

In [17]:
thistuple[0] = 9.871

TypeError: 'tuple' object does not support item assignment

In [18]:
thistuple = (11, 'carrot', 9.871)

In [19]:
thistuple

(11, 'carrot', 9.871)
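Because a tuple cannot be modified, the usual ways to work with one are to unpack it into separate variables or to build a new tuple from pieces of the old one. The sketch below illustrates both; the variable names `number`, `fruit`, `weight` and `updated` are chosen only for this example.

```python
thistuple = (42, "kiwi", 14.911)

# Unpacking assigns each element to its own variable.
number, fruit, weight = thistuple
print(fruit)  # kiwi

# Item assignment fails, so we catch the error instead of stopping.
try:
    thistuple[0] = 9.871
    changed = True
except TypeError:
    changed = False

# A "modified" tuple is really a new tuple built from the old one.
updated = (9.871,) + thistuple[1:]
print(updated)  # (9.871, 'kiwi', 14.911)
```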

### Lists

Python provides lists to store sequences of values. Lists in Python are dynamic: **They grow/shrink on demand**.
Lists are mutable: **Values can change on demand** and also **Data type of individual items can change**.

In [20]:
lst = [1,5,15,7]
print(lst)

[1, 5, 15, 7]


In [21]:
lst[2] = 22
lst

[1, 5, 22, 7]

In [22]:
lst[1] = 'Hello'
lst

[1, 'Hello', 22, 7]

In [23]:
lst[4] = 33

IndexError: list assignment index out of range

In [24]:
lst.append(33)

In [25]:
lst

[1, 'Hello', 22, 7, 33]

List operations enable repetition, concatenation, slicing, iteration, checking for membership, ...

In [26]:
lst * 2

[1, 'Hello', 22, 7, 33, 1, 'Hello', 22, 7, 33]

In [27]:
lst + [55, "kiwi"]

[1, 'Hello', 22, 7, 33, 55, 'kiwi']

In [28]:
lst

[1, 'Hello', 22, 7, 33]

New lists can be created by manipulating existing ones.

In [29]:
newlst = lst * 2

In [30]:
newlst

[1, 'Hello', 22, 7, 33, 1, 'Hello', 22, 7, 33]

In [31]:
len(newlst)

10

In [32]:
newlst[1:4]

['Hello', 22, 7]

In [33]:
newlst[:4]

[1, 'Hello', 22, 7]

In [34]:
newlst[:-2]

[1, 'Hello', 22, 7, 33, 1, 'Hello', 22]

In [35]:
newlst[2:]

[22, 7, 33, 1, 'Hello', 22, 7, 33]

In [36]:
33 in newlst

True

In [37]:
44 in newlst

False

In [38]:
newlst.count(33)

2

In [39]:
for value in lst: 
    print(value)

1
Hello
22
7
33
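Iteration is often combined with the position of each element or with a filter. As a small illustration (not part of the original notebook), the built-in `enumerate` yields position/value pairs, and a list comprehension builds a new list from the items that pass a test.

```python
lst = [1, 'Hello', 22, 7, 33]

# enumerate() pairs each value with its index.
for position, value in enumerate(lst):
    print(position, value)

# Keep only the integer items in a new list.
numbers = [value for value in lst if isinstance(value, int)]
print(numbers)  # [1, 22, 7, 33]
```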


Some other methods modify the list itself (in place) instead of returning a new one.

In [40]:
values = [5, 1, 88, 3]

In [41]:
values

[5, 1, 88, 3]

In [42]:
values.reverse()

In [43]:
values

[3, 88, 1, 5]

In [44]:
values.sort()

In [45]:
values

[1, 3, 5, 88]

In [46]:
values.insert(2, 101) # Insert 101 into list at index 2, that is at the 3rd position.

In [47]:
values

[1, 3, 101, 5, 88]

In [48]:
values.pop(2) # Deletes the ith element of the list and returns its value.

101

In [49]:
values

[1, 3, 5, 88]

In [50]:
newlst.index(33) # Returns index of first occurrence of 33.

4

In [51]:
newlst.index(88)

ValueError: 88 is not in list

In [52]:
newlst.remove(33) # Deletes the first occurrence of 33 in list.

In [53]:
newlst

[1, 'Hello', 22, 7, 1, 'Hello', 22, 7, 33]

In [54]:
newlst.index(33)

8
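The difference between methods that change a list in place and functions that return a new list is easy to overlook. The sketch below contrasts `sorted()` with `.sort()`, and shows that assigning a list to another name does not copy it. The names `ordered`, `result`, `snapshot` and `alias` are illustrative only.

```python
values = [5, 1, 88, 3]

ordered = sorted(values)    # returns a new sorted list; 'values' is unchanged
print(values)               # [5, 1, 88, 3]

result = values.sort()      # sorts in place and returns None
print(values, result)       # [1, 3, 5, 88] None

snapshot = values.copy()    # an independent copy
alias = values              # a second name for the same list
alias.append(100)

print(values)               # [1, 3, 5, 88, 100]
print(snapshot)             # [1, 3, 5, 88]
```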

Lists can contain tuples.

In [55]:
data = [("julius", 3),
("maria", 2), 
("alice", 4),
("maria", 1)]

In [56]:
data

[('julius', 3), ('maria', 2), ('alice', 4), ('maria', 1)]

In [57]:
for (n, a) in data:
    print("I met %s %s times" % (n, a) )

I met julius 3 times
I met maria 2 times
I met alice 4 times
I met maria 1 times


In [58]:
for x in data:
    print("I met %s %s times" % (x[0], x[1]) )

I met julius 3 times
I met maria 2 times
I met alice 4 times
I met maria 1 times


In [59]:
data.sort()

In [60]:
data

[('alice', 4), ('julius', 3), ('maria', 1), ('maria', 2)]
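Tuples are compared element by element, so `data.sort()` orders the pairs by name first and by number second. To order by another element, `sorted()` and `.sort()` accept a `key` function. A minimal sketch, ordering by the number of meetings:

```python
data = [("julius", 3), ("maria", 2), ("alice", 4), ("maria", 1)]

# Sort by the second element of each tuple, largest first.
by_count = sorted(data, key=lambda pair: pair[1], reverse=True)
print(by_count)
# [('alice', 4), ('julius', 3), ('maria', 2), ('maria', 1)]
```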

### Sets

A set is a collection of unique values that is unordered and unindexed.

In [61]:
myset = {'alice', 'julius', 'maria', 'maria'}

In [62]:
myset

{'alice', 'julius', 'maria'}

In [63]:
myset[1]

TypeError: 'set' object does not support indexing

In [64]:
myset.sort()

AttributeError: 'set' object has no attribute 'sort'

In [65]:
myset.append('cornelia')

AttributeError: 'set' object has no attribute 'append'

In [66]:
myset.add('cornelia')

In [67]:
myset

{'alice', 'cornelia', 'julius', 'maria'}

In [68]:
myset.add(3)

In [69]:
print(myset)

{'maria', 3, 'julius', 'cornelia', 'alice'}


In [70]:
myset

{3, 'alice', 'cornelia', 'julius', 'maria'}

In [71]:
'alice' in myset

True

In [72]:
for values in myset:
    print(values)

maria
3
julius
cornelia
alice


In [73]:
len(myset)

5
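Sets also support the usual mathematical operations, and converting a list to a set is a common way to remove duplicates. The example below is a small sketch with made-up sets `a` and `b`; the results are passed through `sorted()` only to get a predictable printing order.

```python
a = {'alice', 'julius', 'maria'}
b = {'maria', 'cornelia'}

print(sorted(a | b))   # union: ['alice', 'cornelia', 'julius', 'maria']
print(sorted(a & b))   # intersection: ['maria']
print(sorted(a - b))   # difference: ['alice', 'julius']

# Removing duplicates from a list.
names = ['maria', 'alice', 'maria', 'julius']
unique_names = sorted(set(names))
print(unique_names)    # ['alice', 'julius', 'maria']
```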

### Dictionaries

Dictionaries are lookup tables that map a **key** to a **value**. The keys of a dictionary behave like a set, so duplicate keys are not allowed.

In [74]:
cities= {'A': 'Ancona',
'B': 'Bary',
'C': 'Como'}

In [75]:
cities

{'A': 'Ancona', 'B': 'Bary', 'C': 'Como'}

In [76]:
cities['A']

'Ancona'

In [77]:
cities.get('A')

'Ancona'

In [78]:
cities['X']

KeyError: 'X'

In [79]:
cities.get('X','unknown')

'unknown'

In [81]:
cities['D'] = 'Domodossola'

In [82]:
cities['D']

'Domodossola'

Values can be of any type.

In [83]:
cities['E'] = 42

In [84]:
cities

{'A': 'Ancona', 'B': 'Bary', 'C': 'Como', 'D': 'Domodossola', 'E': 42}

Keys can be of any hashable data type, such as numbers, strings or tuples.

In [85]:
cities[42] = 'E'

In [86]:
cities

{42: 'E', 'A': 'Ancona', 'B': 'Bary', 'C': 'Como', 'D': 'Domodossola', 'E': 42}

In [87]:
cities.pop(42)

'E'

In [88]:
cities

{'A': 'Ancona', 'B': 'Bary', 'C': 'Como', 'D': 'Domodossola', 'E': 42}

In [89]:
del cities['E']

In [90]:
cities

{'A': 'Ancona', 'B': 'Bary', 'C': 'Como', 'D': 'Domodossola'}

In [91]:
len(cities)

4
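One restriction applies to dictionary keys: they must be hashable, which in practice means immutable values such as numbers, strings and tuples. The sketch below uses a hypothetical `lookup` dictionary to show that a tuple works as a key while a list does not.

```python
lookup = {}

lookup[(41.9, 12.5)] = 'Roma'   # a tuple is hashable, so it can be a key
lookup[42] = 'answer'           # so can a number

# A list is mutable and therefore not hashable.
try:
    lookup[[41.9, 12.5]] = 'Roma'
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)  # unhashable type: 'list'
print(len(lookup))    # 2
```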

In [92]:
nobel = {
(1979, "physics"): ["Glashow", "Salam", "Weinberg"],
(1962, "chemistry"): ["Hodgkin"],
(1984, "biology"): ["McClintock"],
}

In [93]:
nobel

{(1962, 'chemistry'): ['Hodgkin'],
 (1979, 'physics'): ['Glashow', 'Salam', 'Weinberg'],
 (1984, 'biology'): ['McClintock']}

In [94]:
for key in nobel:
    print(key)

(1979, 'physics')
(1962, 'chemistry')
(1984, 'biology')


In [95]:
for key in nobel:
    print(nobel[key])

['Glashow', 'Salam', 'Weinberg']
['Hodgkin']
['McClintock']


In [96]:
for (key, value) in nobel.items():
    print(key, value)

(1979, 'physics') ['Glashow', 'Salam', 'Weinberg']
(1962, 'chemistry') ['Hodgkin']
(1984, 'biology') ['McClintock']


In [97]:
nobel.values()

dict_values([['Glashow', 'Salam', 'Weinberg'], ['Hodgkin'], ['McClintock']])

In [98]:
nobel.keys()

dict_keys([(1979, 'physics'), (1962, 'chemistry'), (1984, 'biology')])
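The iteration patterns above can also be written as comprehensions, which build a new dictionary or list in one expression. A minimal sketch that reuses the `nobel` dictionary as defined in this notebook; `laureate_counts` and `physics_years` are names introduced only for the example.

```python
nobel = {
    (1979, "physics"): ["Glashow", "Salam", "Weinberg"],
    (1962, "chemistry"): ["Hodgkin"],
    (1984, "biology"): ["McClintock"],
}

# Map each key to the number of names stored under it.
laureate_counts = {key: len(names) for key, names in nobel.items()}
print(laureate_counts[(1979, "physics")])  # 3

# Tuple keys can be unpacked directly while iterating.
physics_years = [year for (year, field) in nobel if field == "physics"]
print(physics_years)  # [1979]
```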

# Part 2 - Pandas

A library with useful tools for data engineering.

In [99]:
import numpy as np
import pandas as pd

## Series
A Series represents a one-dimensional, labeled sequence of values.

In [100]:
s = pd.Series([1, 3, 5, np.nan, 6, 8])

In [101]:
s

0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
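The `NaN` entry marks a missing value, and it is the reason the whole Series is stored as `float64`. By default, pandas aggregations skip missing values. The sketch below, which rebuilds the same Series, shows how to count them and how they affect `sum` and `mean`.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 5, np.nan, 6, 8])

missing = int(s.isna().sum())   # number of missing entries
present = int(s.count())        # number of non-missing entries
print(missing, present)         # 1 5

# sum() and mean() ignore NaN unless skipna=False is passed.
print(s.sum())                  # 23.0
print(s.mean())                 # 4.6
print(s.sum(skipna=False))      # nan
```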

**Jupyter** does not enforce the flow of execution. Code snippets can be executed in any order, possibly changing the state of the variables used.

In [102]:
myset

{3, 'alice', 'cornelia', 'julius', 'maria'}

In [103]:
myindex = [value for value in myset]

In [104]:
myindex

['maria', 3, 'julius', 'cornelia', 'alice']

In [105]:
myseries = pd.Series(np.random.randn(5), index = myindex)

In [106]:
myseries

maria      -0.601428
3          -0.004417
julius     -0.747530
cornelia    0.084172
alice       0.590208
dtype: float64

Note that **randn** returns a sample (or samples) from the “standard normal” distribution.

https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.randn.html

In [107]:
myseries['pitt'] = 1.5

In [108]:
myseries

maria      -0.601428
3          -0.004417
julius     -0.747530
cornelia    0.084172
alice       0.590208
pitt        1.500000
dtype: float64

In [109]:
myseries['julius']

-0.7475296432987291

In [110]:
myseries.max()

1.5

In [111]:
myseries.sum()

0.8210045925335643

In [112]:
nobel

{(1962, 'chemistry'): ['Hodgkin'],
 (1979, 'physics'): ['Glashow', 'Salam', 'Weinberg'],
 (1984, 'biology'): ['McClintock']}

In [113]:
nobelseries = pd.Series(nobel)

In [114]:
nobelseries

1979  physics      [Glashow, Salam, Weinberg]
1962  chemistry                     [Hodgkin]
1984  biology                    [McClintock]
dtype: object

In [115]:
nobelseries[(1984,'biology')]

['McClintock']

## DataFrames

A 2-dimensional labeled data structure with columns of potentially different types.

In [116]:
seriesA = pd.Series([1., 2., 3.], index=['a', 'b', 'c'])
seriesB = pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])

In [117]:
d = {'Series A': seriesA, 'Series B': seriesB}

In [118]:
frame = pd.DataFrame(d)

In [119]:
frame

   Series A  Series B
a       1.0       1.0
b       2.0       2.0
c       3.0       3.0
d       NaN       4.0


In [120]:
frame['Series A']

a    1.0
b    2.0
c    3.0
d    NaN
Name: Series A, dtype: float64

In [121]:
frame['Series A']['b']

2.0

In [122]:
frame.iloc[2]

Series A    3.0
Series B    3.0
Name: c, dtype: float64
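pandas offers two explicit row accessors: `iloc` selects by integer position, as in the cell above, while `loc` selects by label. Both accept a second argument to pick a column. The sketch below rebuilds the same `frame` so that it can be run on its own.

```python
import pandas as pd

seriesA = pd.Series([1., 2., 3.], index=['a', 'b', 'c'])
seriesB = pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])
frame = pd.DataFrame({'Series A': seriesA, 'Series B': seriesB})

row_by_position = frame.iloc[2]      # third row, whatever its label
row_by_label = frame.loc['c']        # row whose index label is 'c'
print(row_by_label.equals(row_by_position))  # True

# Row label and column name in a single lookup.
cell = frame.loc['b', 'Series A']
print(cell)  # 2.0

# 'd' exists only in seriesB, so its 'Series A' value is missing.
missing = frame.loc['d', 'Series A']
print(pd.isna(missing))  # True
```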

In [123]:
frame = pd.DataFrame(d, index=['d', 'b', 'a'])

In [124]:
frame

   Series A  Series B
d       NaN       4.0
b       2.0       2.0
a       1.0       1.0


In [125]:
frame = pd.DataFrame(d, index=['d', 'b', 'a'], columns=['Series B', 'Series C'])

In [126]:
frame

   Series B Series C
d       4.0      NaN
b       2.0      NaN
a       1.0      NaN


In [127]:
frame['Series B'] > 2.0

d     True
b    False
a    False
Name: Series B, dtype: bool

In [128]:
frame['Series B']['b'] = 2.5

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


In [129]:
frame['Series B'] > 2.0

d     True
b     True
a    False
Name: Series B, dtype: bool
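The warning above appears because `frame['Series B']['b'] = 2.5` performs two separate lookups, and the assignment may end up on a temporary copy instead of on `frame` itself. A single `.loc[row, column]` lookup avoids the ambiguity. The following is a minimal sketch on a small frame rebuilt for the purpose, not the notebook's original cell.

```python
import pandas as pd

frame = pd.DataFrame({'Series B': [4.0, 2.0, 1.0]}, index=['d', 'b', 'a'])

# One indexing step: row label first, column name second.
frame.loc['b', 'Series B'] = 2.5

mask = frame['Series B'] > 2.0
print(mask.tolist())  # [True, True, False]
```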

In [130]:
frame.insert(1, 'copy B', frame['Series B'])

In [131]:
frame

   Series B  copy B Series C
d       4.0     4.0      NaN
b       2.5     2.5      NaN
a       1.0     1.0      NaN


In [132]:
frame.loc[frame['Series B'] > 2.0]

   Series B  copy B Series C
d       4.0     4.0      NaN
b       2.5     2.5      NaN
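Boolean conditions like the one above can be combined. pandas uses `&` (and), `|` (or) and `~` (not) for element-wise logic, and each condition needs its own parentheses because these operators bind more tightly than comparisons. A small sketch on a frame with the same values:

```python
import pandas as pd

frame = pd.DataFrame(
    {'Series B': [4.0, 2.5, 1.0], 'copy B': [4.0, 2.5, 1.0]},
    index=['d', 'b', 'a'],
)

in_range = (frame['Series B'] > 2.0) & (frame['Series B'] < 4.0)

between = frame.loc[in_range]     # rows where both conditions hold
outside = frame.loc[~in_range]    # all the other rows

print(list(between.index))  # ['b']
print(list(outside.index))  # ['d', 'a']
```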


# Part 3 - Open Data
https://dati.comune.roma.it/
Open Data portal of the City of Rome
### Strutture ricettive gennaio 2019
Lists the hotels, hostels and, more generally, all accommodation facilities within Rome that were active during January 2019 (the title translates to "Accommodation facilities, January 2019").

https://dati.comune.roma.it/catalog/dataset/d823/resource/9964559d-0a9b-4dd6-a417-eb1ed019ab59

In [134]:
dataset = pd.read_csv('../data/opendata_suar_gennaio.csv', sep=',', delimiter=None, header='infer',
names=None, index_col=None, usecols=None, encoding = "ISO-8859-1", nrows=20)

The dataset uses a total of 25 features to describe each entry. Each feature is a single column of the DataFrame, and each row holds the information for one entity.

In [136]:
dataset.columns

Index(['Insegna', 'Classificazione', 'Indirizzo', 'Municipio', 'Tipologia',
       'Singole', 'Doppie', 'Triple', 'Quadruple', 'Quintuple', 'Sestuple',
       'Unitaâ Abitative', 'Posti Letto Unitaâ Abitative', 'Sito Web',
       'Email', 'Telefono', 'Fax', 'Cellulare', 'Contatto Facebook',
       'Contatto Twitter', 'Contatto Instagram', 'Contatto Altro Social',
       'Unnamed: 22', 'Unnamed: 23', 'Unnamed: 24'],
      dtype='object')
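The `Unnamed: 22` to `Unnamed: 24` entries are names that pandas generates for header fields that are empty, which often happens when lines end with trailing separators. Whether that is the cause in this particular file is an assumption. The sketch below reproduces the effect on a small, made-up CSV held in memory and shows one way to drop columns that contain no data at all.

```python
import io
import pandas as pd

# Hypothetical CSV text: the trailing comma creates an empty third field.
csv_text = "name,rooms,\nAlpha,2,\nBeta,,\n"

df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns))       # ['name', 'rooms', 'Unnamed: 2']

# Drop every column whose values are all missing.
cleaned = df.dropna(axis=1, how='all')
print(list(cleaned.columns))  # ['name', 'rooms']
```

Columns that are only partly empty, such as `rooms` here, are kept.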

The Rome Open Data Portal includes a [Data Dictionary](https://en.wikipedia.org/wiki/Data_dictionary) at the bottom of the web page of each dataset. For this particular dataset, the data dictionary is as follows:

|**Column** | **Type** | **Description**      |
|-----------|:--------:|---------------------:|
|Insegna    | text     | Name of the structure |
| Classificazione | text | Category of structure |
| Indirizzo | text | The street where the structure is located |
| Municipio | text | The number of the street |
| Tipologia | text | [The administrative subdivision of Rome](https://en.wikipedia.org/wiki/Administrative_subdivision_of_Rome) |
| Singole | text | Number of single-bed rooms |
| Doppie | numeric | Number of double-bed rooms |
| Triple | numeric | Number of three-bed rooms |
| Quadruple | numeric | Number of four-bed rooms |
| Quintuple | numeric | Number of five-bed rooms |
| Sestuple | numeric | Number of six-bed rooms |
| Unita’ Abitative | numeric | Number of housing units |
| Posti Letto Unita’ Abitative | numeric | Number of beds |
| Sito Web | numeric | Not used |
| Email | text | Not used |
| Telefono | text | Not used |
| Fax | numeric | Not used |
| Cellulare | text | Not used |
| Contatto Facebook | numeric | Not used |	
| Contatto Twitter | text | Not used |
| Contatto Instagram | text | Not used |
| Contatto Altro Social | text | Not used |

In [137]:
dataset.head()

Unnamed: 0,Insegna,Classificazione,Indirizzo,Municipio,Tipologia,Singole,Doppie,Triple,Quadruple,Quintuple,...,Telefono,Fax,Cellulare,Contatto Facebook,Contatto Twitter,Contatto Instagram,Contatto Altro Social,Unnamed: 22,Unnamed: 23,Unnamed: 24
0,"Casa e Appartamento per Vacanze ""Trastevere_ho...",Categoria 2,VIA FEDERICO ROSAZZA,52,Mun. XII,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,
1,"Casa e Appartamento per Vacanze ""VIVES 63""",Categoria 1,VIA DI MONTE VERDE,63,Mun. XII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
2,"CASA E APPARTAMENTO PER VACANZE ""LA CASA DI FA...",Categoria 2,VIA AUGUSTO AUBRY,1,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,
3,casa e appartamento per vacanze Valentina's house,Categoria 2,VIALE GIULIO CESARE,51/A,Mun. I,Casa Vacanze NON impr (Appartamento),,2.0,,,...,,,,,,,,,,
4,OLD CITY TESTACCIO CASA E APPARTAMENTO PER VAC...,Categoria 2,VIA ORAZIO ANTINORI,2,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,1.0,,,...,,,,,,,,,,


In [138]:
dataset[:3] # Look at the first 3 rows

Unnamed: 0,Insegna,Classificazione,Indirizzo,Municipio,Tipologia,Singole,Doppie,Triple,Quadruple,Quintuple,...,Telefono,Fax,Cellulare,Contatto Facebook,Contatto Twitter,Contatto Instagram,Contatto Altro Social,Unnamed: 22,Unnamed: 23,Unnamed: 24
0,"Casa e Appartamento per Vacanze ""Trastevere_ho...",Categoria 2,VIA FEDERICO ROSAZZA,52,Mun. XII,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,
1,"Casa e Appartamento per Vacanze ""VIVES 63""",Categoria 1,VIA DI MONTE VERDE,63,Mun. XII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
2,"CASA E APPARTAMENTO PER VACANZE ""LA CASA DI FA...",Categoria 2,VIA AUGUSTO AUBRY,1,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,


In [139]:
dataset['Municipio'] # Select a column

0        52
1        63
2         1
3      51/A
4         2
5        13
6       107
7        78
8        95
9       203
10       56
11       11
12       59
13       13
14      186
15       26
16       23
17       12
18        9
19      193
Name: Municipio, dtype: object

In [140]:
dataset.index

RangeIndex(start=0, stop=20, step=1)

In [141]:
dataset.columns

Index(['Insegna', 'Classificazione', 'Indirizzo', 'Municipio', 'Tipologia',
       'Singole', 'Doppie', 'Triple', 'Quadruple', 'Quintuple', 'Sestuple',
       'Unitaâ Abitative', 'Posti Letto Unitaâ Abitative', 'Sito Web',
       'Email', 'Telefono', 'Fax', 'Cellulare', 'Contatto Facebook',
       'Contatto Twitter', 'Contatto Instagram', 'Contatto Altro Social',
       'Unnamed: 22', 'Unnamed: 23', 'Unnamed: 24'],
      dtype='object')

In [142]:
dataset.sort_values('Tipologia')

Unnamed: 0,Insegna,Classificazione,Indirizzo,Municipio,Tipologia,Singole,Doppie,Triple,Quadruple,Quintuple,...,Telefono,Fax,Cellulare,Contatto Facebook,Contatto Twitter,Contatto Instagram,Contatto Altro Social,Unnamed: 22,Unnamed: 23,Unnamed: 24
19,CASA E APPARTAMENTO PER VACANZE VATICAN MUSEUM...,Categoria 2,VIALE GIULIO CESARE,193,Mun. I,Casa Vacanze NON impr (Appartamento),2.0,1.0,,1.0,...,,,,,,,,,,
17,casa e Appartamento per Vacanze TOLEMAIDE 12 H...,Categoria 2,VIA TOLEMAIDE,12,Mun. I,Casa Vacanze NON impr (Appartamento),,1.0,1.0,,...,,,,,,,,,,
2,"CASA E APPARTAMENTO PER VACANZE ""LA CASA DI FA...",Categoria 2,VIA AUGUSTO AUBRY,1,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,
3,casa e appartamento per vacanze Valentina's house,Categoria 2,VIALE GIULIO CESARE,51/A,Mun. I,Casa Vacanze NON impr (Appartamento),,2.0,,,...,,,,,,,,,,
4,OLD CITY TESTACCIO CASA E APPARTAMENTO PER VAC...,Categoria 2,VIA ORAZIO ANTINORI,2,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,1.0,,,...,,,,,,,,,,
15,Casa e Appartamento per Vacanze VATICAN SIGHTS...,Categoria 2,PIAZZA DEI PRATI DEGLI STROZZI,26,Mun. I,Casa Vacanze NON impr (Appartamento),,2.0,,,...,,,,,,,,,,
12,"""EX- STUDIO ROMA"" CASA E APPARTAMENTO PER VACANZE",Categoria 1,VIA MACHIAVELLI,59,Mun. I,Casa Vacanze NON impr (Appartamento),,2.0,,,...,,,,,,,,,,
7,CASA E APPARTAMENTO PER VACANZE UNA NOTTE AI M...,Categoria 2,VIA SEBASTIANO VENIERO,78,Mun. I,Casa Vacanze NON impr (Appartamento),,,1.0,,...,,,,,,,,,,
11,CASA E APPARTAMENTO PER VACANZE AQPENTAHOUSE,Categoria 1,VIA DI S. ANNA,11,Mun. I,Casa Vacanze NON impr (Appartamento),,2.0,,,...,,,,,,,,,,
18,Casa e Appartamento per Vacanze ESQUILINO IX,Categoria 2,VIA BIXIO,9,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,


In [143]:
dataset['Fax'].isnull()

0     True
1     True
2     True
3     True
4     True
5     True
6     True
7     True
8     True
9     True
10    True
11    True
12    True
13    True
14    True
15    True
16    True
17    True
18    True
19    True
Name: Fax, dtype: bool

In [144]:
dataset.drop(columns=['Fax'])

Unnamed: 0,Insegna,Classificazione,Indirizzo,Municipio,Tipologia,Singole,Doppie,Triple,Quadruple,Quintuple,...,Email,Telefono,Cellulare,Contatto Facebook,Contatto Twitter,Contatto Instagram,Contatto Altro Social,Unnamed: 22,Unnamed: 23,Unnamed: 24
0,"Casa e Appartamento per Vacanze ""Trastevere_ho...",Categoria 2,VIA FEDERICO ROSAZZA,52,Mun. XII,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,
1,"Casa e Appartamento per Vacanze ""VIVES 63""",Categoria 1,VIA DI MONTE VERDE,63,Mun. XII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
2,"CASA E APPARTAMENTO PER VACANZE ""LA CASA DI FA...",Categoria 2,VIA AUGUSTO AUBRY,1,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,
3,casa e appartamento per vacanze Valentina's house,Categoria 2,VIALE GIULIO CESARE,51/A,Mun. I,Casa Vacanze NON impr (Appartamento),,2.0,,,...,,,,,,,,,,
4,OLD CITY TESTACCIO CASA E APPARTAMENTO PER VAC...,Categoria 2,VIA ORAZIO ANTINORI,2,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,1.0,,,...,,,,,,,,,,
5,"CASA E APPARTAMENTO PER VACANZE""IDEAL PLACE""",Categoria 2,VIA GINO CAPPONI,13,Mun. VII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
6,Casa e Appartamento per Vacanze E45,Categoria 1,VIA PORTUENSE,107,Mun. XII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
7,CASA E APPARTAMENTO PER VACANZE UNA NOTTE AI M...,Categoria 2,VIA SEBASTIANO VENIERO,78,Mun. I,Casa Vacanze NON impr (Appartamento),,,1.0,,...,,,,,,,,,,
8,TIMPERI HOUSE,Categoria 2,VIA GASPARE GOZZI,95,Mun. VIII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
9,LA TERRAZZA DI ISABEL,Categoria 1,VIA FLAMINIA,203,Mun. II,Casa Vacanze NON impr (Appartamento),1.0,2.0,,,...,,,,,,,,,,


In [145]:
dataset

Unnamed: 0,Insegna,Classificazione,Indirizzo,Municipio,Tipologia,Singole,Doppie,Triple,Quadruple,Quintuple,...,Telefono,Fax,Cellulare,Contatto Facebook,Contatto Twitter,Contatto Instagram,Contatto Altro Social,Unnamed: 22,Unnamed: 23,Unnamed: 24
0,"Casa e Appartamento per Vacanze ""Trastevere_ho...",Categoria 2,VIA FEDERICO ROSAZZA,52,Mun. XII,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,
1,"Casa e Appartamento per Vacanze ""VIVES 63""",Categoria 1,VIA DI MONTE VERDE,63,Mun. XII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
2,"CASA E APPARTAMENTO PER VACANZE ""LA CASA DI FA...",Categoria 2,VIA AUGUSTO AUBRY,1,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,,,,...,,,,,,,,,,
3,casa e appartamento per vacanze Valentina's house,Categoria 2,VIALE GIULIO CESARE,51/A,Mun. I,Casa Vacanze NON impr (Appartamento),,2.0,,,...,,,,,,,,,,
4,OLD CITY TESTACCIO CASA E APPARTAMENTO PER VAC...,Categoria 2,VIA ORAZIO ANTINORI,2,Mun. I,Casa Vacanze NON impr (Appartamento),1.0,1.0,,,...,,,,,,,,,,
5,"CASA E APPARTAMENTO PER VACANZE""IDEAL PLACE""",Categoria 2,VIA GINO CAPPONI,13,Mun. VII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
6,Casa e Appartamento per Vacanze E45,Categoria 1,VIA PORTUENSE,107,Mun. XII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
7,CASA E APPARTAMENTO PER VACANZE UNA NOTTE AI M...,Categoria 2,VIA SEBASTIANO VENIERO,78,Mun. I,Casa Vacanze NON impr (Appartamento),,,1.0,,...,,,,,,,,,,
8,TIMPERI HOUSE,Categoria 2,VIA GASPARE GOZZI,95,Mun. VIII,Casa Vacanze NON impr (Appartamento),,1.0,,,...,,,,,,,,,,
9,LA TERRAZZA DI ISABEL,Categoria 1,VIA FLAMINIA,203,Mun. II,Casa Vacanze NON impr (Appartamento),1.0,2.0,,,...,,,,,,,,,,
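`drop` returns a new DataFrame and leaves the original untouched, which is why `Fax` is still listed when `dataset` is displayed again. To keep the change, the result has to be assigned to a name. A minimal sketch on a made-up two-row frame (the values are placeholders, not taken from the dataset):

```python
import pandas as pd

df = pd.DataFrame({'Insegna': ['Alpha', 'Beta'], 'Fax': [None, None]})

without_fax = df.drop(columns=['Fax'])
print('Fax' in df.columns)           # True: the original is unchanged
print('Fax' in without_fax.columns)  # False

# Re-assigning makes the change stick under the original name.
df = df.drop(columns=['Fax'])
print(list(df.columns))              # ['Insegna']
```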


In [146]:
dataset['Quadruple'].isnull()

0      True
1      True
2      True
3      True
4      True
5      True
6      True
7     False
8      True
9      True
10     True
11     True
12     True
13     True
14     True
15     True
16    False
17    False
18     True
19     True
Name: Quadruple, dtype: bool

In [147]:
dataset['Quadruple'].fillna(0)

0     0.0
1     0.0
2     0.0
3     0.0
4     0.0
5     0.0
6     0.0
7     1.0
8     0.0
9     0.0
10    0.0
11    0.0
12    0.0
13    0.0
14    0.0
15    0.0
16    1.0
17    1.0
18    0.0
19    0.0
Name: Quadruple, dtype: float64

In [148]:
dataset['Quadruple'].dropna()

7     1.0
16    1.0
17    1.0
Name: Quadruple, dtype: float64

In [149]:
dataset['Quadruple'].sum()

3.0
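Like `drop`, the methods `fillna` and `dropna` return new objects, so the column inside `dataset` still contains its missing values after the cells above. The choice between them also matters for statistics: replacing `NaN` with 0 leaves the sum unchanged but lowers the mean. The sketch below uses a short, made-up Series in place of the real column.

```python
import numpy as np
import pandas as pd

quadruple = pd.Series([np.nan, 1.0, np.nan, 1.0, 1.0])

filled = quadruple.fillna(0)            # new Series; the original keeps its NaN
print(int(quadruple.isna().sum()))      # 2

print(quadruple.sum(), filled.sum())    # 3.0 3.0
print(quadruple.mean(), filled.mean())  # 1.0 0.6

# Once nothing is missing, the counts can be stored as integers.
as_int = filled.astype(int)
print(as_int.tolist())                  # [0, 1, 0, 1, 1]
```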

In [150]:
dataset['Unitaâ Abitative'].max()

8

In [151]:
dataset['Unitaâ Abitative'].idxmax()

19

In [152]:
dataset.loc[19]

Insegna                           CASA E APPARTAMENTO PER VACANZE VATICAN MUSEUM...
Classificazione                                                         Categoria 2
Indirizzo                                                       VIALE GIULIO CESARE
Municipio                                                                       193
Tipologia                                                                   Mun. I 
Singole                                        Casa Vacanze NON impr (Appartamento)
Doppie                                                                            2
Triple                                                                            1
Quadruple                                                                       NaN
Quintuple                                                                         1
Sestuple                                                                        NaN
Unitaâ Abitative                                                          