# Before we start

This tutorial on **Unsupervised learning** assumes some Python programming experience, including familiarity with `numpy` and `pandas`, as well as a basic understanding of `sklearn`.

So, before we move on, let's quickly check these assumptions.

#### Check the Environment

The first thing to check is that your environment is ready and correctly set up.
To do so, please execute the following cell and verify that you can import
the core packages on which this tutorial is built.

**Note**: Specific extra packages will be installed whenever needed.

In [1]:
import numpy as np  # this is a convention
import scipy as sp
import pandas as pd
import sklearn

version = lambda p: print(f"{p.__name__} {p.__version__}")

version(sklearn)  # Important: sklearn version >= 0.22
version(np)
version(sp)
version(pd)  # Important: pandas >= 1.0

sklearn 0.23.2
numpy 1.19.1
scipy 1.5.2
pandas 1.0.4
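
If you would rather check the minimum versions programmatically than by eye, the comparison could be sketched as follows. This is only a minimal sketch: `parse_version` and `meets_minimum` are hypothetical helper names (not part of any library), and the parsing is deliberately simplified, reading only the leading numeric components of a version string. For full version semantics, the `packaging.version` module is the more robust choice.

```python
def parse_version(text):
    """Turn a version string such as '0.23.2' into a tuple of ints: (0, 23, 2).

    Simplified on purpose: parsing stops at the first component that does not
    start with a digit, and any suffix such as 'rc1' is ignored.
    """
    parts = []
    for chunk in text.split("."):
        digits = ""
        for ch in chunk:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if `installed` is at least `required`."""
    a, b = parse_version(installed), parse_version(required)
    # Pad the shorter tuple with zeros so that '1.0' and '1.0.0' compare equal
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Version strings taken from the output printed above
print(meets_minimum("0.23.2", "0.22"))  # sklearn >= 0.22
print(meets_minimum("1.0.4", "1.0"))    # pandas >= 1.0
```

In the notebook you could pass `sklearn.__version__` and `pd.__version__` instead of the literal strings.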


#### Exercises and Questions

Quick and simple exercises to be executed directly in the notebook.

##### 1. Guess the data type

In [2]:
l = [1, 3, 5, 7, 9]

# What is the type of l ?
# Try to guess it first, then use type(l) to check your answer

In [3]:
a = np.array([1, 3, 5, 7, 9])

# What is the type of a ?
# Try to guess it first, then use type(a) to check your answer

##### 2. Spot the Difference

If we print `l` and `a`

In [4]:
print(l)

[1, 3, 5, 7, 9]


In [5]:
print(a)

[1 3 5 7 9]


they look very much the same.
Is it really so? 

What are the differences (if any) between `l` and `a`?

##### 3. Check and Reply

In [6]:
L = [[1, 2, 3], 
     [4, 5, 6], 
     [7, 8, 9]]

In [7]:
M = np.asarray(L)

Will the following instructions work (`Yes` or `No`)?

```python
>>> L[0, 1]
>>> M[0, 1]
```

```python
>>> L[1]
>>> M[1]
```

**Q**: What will be returned by these last two expressions (if anything)?

##### 4. Guess the data type (cont.)

In [8]:
from sklearn.datasets import load_iris

In [9]:
iris = load_iris(return_X_y=True)

# What is the type of iris ?
# Try to guess it first, then use type(iris) to check your answer

In [10]:
iris = load_iris()
X, y = iris.data, iris.target

# What are the types of X and y?
# Try to guess it first, then use type(X) and/or type(y) to check your answers.

In [11]:
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# What are the types of X and y?
# Try to guess it first, then use type(X) and/or type(y) to check your answers.

**Q**: By the way, have you ever wondered what the type of the `iris` object returned by the `load_iris()` function is?

Try to guess it first (*although this might be difficult, `ed.`*), and then use `type(iris)` to check your answer.

`Hint`: The following instruction actually works

In [14]:
iris.keys()

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename'])

##### 5. Warm-up Playground

Let's play a little with the data.

Print the `shape` of `X`

In [15]:
# your code here..

Print the `shape` of `y`

In [16]:
# your code here..

Check how many different classes there are in the Iris dataset

In [17]:
# your code here..

Select all the samples that belong to class `setosa`

`Hint`: 

* Use `target_names` to find out which label in `y` corresponds to each class name;
* Select the label index corresponding to _setosa_;
* You can use a `boolean` expression to select items from an array. Only those items satisfying the condition (`cond(i) == True`) will be returned.
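
The boolean selection mentioned in the last hint can be sketched on a small toy example. The arrays `values` and `labels` below are made up purely for illustration and are unrelated to the Iris data, so the exercise itself is left for you to solve.

```python
import numpy as np

# Toy data (hypothetical): five measurements, each tagged with a label
values = np.array([10, 25, 3, 48, 7])
labels = np.array([0, 1, 0, 2, 1])

# Comparing an array with a scalar yields a boolean array of the same shape
mask = labels == 1
print(mask.tolist())          # [False, True, False, False, True]

# Indexing with the boolean array keeps only the positions where it is True
print(values[mask].tolist())  # [25, 7]
```

The same pattern applies whenever the condition array and the indexed array have the same length along the first axis.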

In [None]:
# your code here..

#### Well Done 🎉

You completed the warm-up section. Now we are ready to start.