<a href="https://colab.research.google.com/github/cdcmx2020a/groupA_Jose/blob/master/Intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction and basics of the task
## Today we are going to use some libraries to make things easier:


*   numpy
*   matplotlib
*   scipy
*   plotly
*   pandas



![alt text](https://miro.medium.com/max/432/1*jTW7doI_cqC_p9XQrmuu9A.png)

## In this example, the graph consists of 4 points linked by arrows. We call the points "nodes" and the arrows "edges".
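One simple way to hold such a graph in Python is a dictionary that maps each node to the nodes its arrows point to. The sketch below assumes a hypothetical 4-node graph: the labels `A` to `D` and the edges are made up for illustration and are not read from the figure.

```python
# Hypothetical directed graph with 4 nodes; labels and edges are illustrative only.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

# The nodes are the dictionary keys; the edges are (source, target) pairs.
nodes = list(graph)
edges = [(src, dst) for src, targets in graph.items() for dst in targets]

print("nodes:", nodes)
print("edges:", edges)
```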

Here we will build a graph from points at random positions and connect them based on how close they are to their neighbors.

![alt text](https://github.com/napoles-uach/figuras/blob/master/nn1.png?raw=true)





---



# **`NumPy`** is perhaps the most important library we will use. Once imported, it lets us do numerical operations.

In [1]:
import numpy as np

## Now Python knows the values of $\pi$ and $\cos(\pi/4)$:

In [2]:
pi = np.pi
copi = np.cos(pi / 4)

print(pi, copi)

3.141592653589793 0.7071067811865476


If you want something fancier:

In [3]:
print("pi = ",pi, "cos(pi/4) = ",copi)

pi =  3.141592653589793 cos(pi/4) =  0.7071067811865476


Well, more or less ...

# How do you evaluate the distance between two points?

A: Yes, with the **Pythagorean theorem**.

![alt text](https://4.bp.blogspot.com/-aYH9XG-HEMo/WlEtO_gzW7I/AAAAAAAAACo/eYa9Gym9Kck9lWZO9WTO-dyx2_cX5DKVQCLcBGAs/s1600/perimetro-de-un-triangulo.jpg)

Now, with Python:

In [4]:
from scipy.spatial import distance
a1 = (0, 0)
a2 = (3, 4)
d = distance.euclidean(a1, a2)
print("Euclidean distance: ",d)

Euclidean distance:  5.0
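The same number can be obtained by applying the Pythagorean theorem by hand, $d = \sqrt{\Delta x^2 + \Delta y^2}$. A minimal sketch with NumPy, reusing the points (0, 0) and (3, 4); the names `dx`, `dy` and `d_manual` are chosen here for illustration:

```python
import numpy as np

a1 = np.array([0, 0])
a2 = np.array([3, 4])

# Differences along each axis are the two legs of the right triangle.
dx, dy = a2 - a1

# The hypotenuse is the distance between the two points.
d_manual = np.sqrt(dx**2 + dy**2)
print("Distance from the Pythagorean theorem:", d_manual)
```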


# Doing this for many points one pair at a time is not practical, so we can use another function called `cdist`

In [5]:
from scipy.spatial.distance import cdist
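To get a feel for what `cdist` returns, here is a small sketch on three hand-picked points (the array `pts` is an illustrative assumption, not data from this notebook). Entry `[i, j]` of the result is the distance between point `i` and point `j`, so the diagonal is zero and the matrix is symmetric.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Three illustrative points lying on a straight line.
pts = np.array([[0, 0], [3, 4], [6, 8]])

# Pairwise Euclidean distances between every point and every other point.
D = cdist(pts, pts, "euclidean")
print(D)
```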

# But now we need those points. Here we can create these coordinates using random numbers!!!

![alt text](https://spicyyoghurt.com/img/tutorials/headers/large/random_header.jpg)

In [6]:
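np.random.seed(1973)  # fix the seed so the results are reproducible
N = 10  # number of points
# Note: the bounds are passed as (-100, -120), so the values fall between -120 and -100
longitud = np.random.uniform(-100, -120, size=(N,))
latitud = np.random.uniform(20, 30, size=(N,))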
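Setting the seed is what makes these "random" coordinates reproducible. The sketch below illustrates this on a smaller sample: resetting the same seed before drawing yields the same numbers. The variable names `first` and `second` are illustrative.

```python
import numpy as np

np.random.seed(1973)
first = np.random.uniform(20, 30, size=(5,))

np.random.seed(1973)  # reset to the same seed
second = np.random.uniform(20, 30, size=(5,))

# Both draws start from the same generator state, so they match.
print(np.array_equal(first, second))
```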


### Print `longitud` and `latitud` to see how they look. Feel free to change the parameters to see what happens.

In [7]:
print("longitud: ", longitud)

longitud:  [-109.6699348  -105.6176406  -113.13509911 -102.3467356  -110.31682821
 -114.48555437 -112.87887082 -117.7051661  -117.59489811 -106.18942879]


# Now we can see them on the plane using `plotly`.

In [8]:
import plotly.express as px
# longitude on the horizontal axis, latitude on the vertical axis
fig = px.scatter(x=longitud, y=latitud)
fig.show()

In [9]:
xy = np.column_stack((longitud, latitud))  # one (longitud, latitud) row per point

In [10]:
xy

array([[-109.6699348 ,   22.82254546],
       [-105.6176406 ,   21.87396039],
       [-113.13509911,   21.08800552],
       [-102.3467356 ,   24.27418792],
       [-110.31682821,   25.32547259],
       [-114.48555437,   26.19645719],
       [-112.87887082,   22.3290682 ],
       [-117.7051661 ,   20.25495623],
       [-117.59489811,   22.44989631],
       [-106.18942879,   25.28597205]])

In [11]:
distancias = cdist(xy, xy, 'euclidean')
# To locate the closest pair, one could use:
# np.where(distancias == np.min(distancias[np.nonzero(distancias)]))
distancias

array([[ 0.        ,  4.16183876,  3.87504742,  7.46568901,  2.5851722 ,
         5.8799211 ,  3.24665831,  8.43548793,  7.93371986,  4.26408169],
       [ 4.16183876,  0.        ,  7.55843287,  4.05708168,  5.83054892,
         9.86528627,  7.2754785 , 12.1954683 , 11.99109668,  3.45959034],
       [ 3.87504742,  7.55843287,  0.        , 11.24902421,  5.08908421,
         5.28393867,  1.26723696,  4.64537226,  4.66310557,  8.11574145],
       [ 7.46568901,  4.05708168, 11.24902421,  0.        ,  8.03912779,
        12.29007893, 10.7102457 , 15.87562946, 15.35690398,  3.97366306],
       [ 2.5851722 ,  5.83054892,  5.08908421,  8.03912779,  0.        ,
         4.258743  ,  3.94239795,  8.96089688,  7.82555048,  4.12758843],
       [ 5.8799211 ,  9.86528627,  5.28393867, 12.29007893,  4.258743  ,
         0.        ,  4.18785502,  6.75776097,  4.86875106,  8.3459381 ],
       [ 3.24665831,  7.2754785 ,  1.26723696, 10.7102457 ,  3.94239795,
         4.18785502,  0.        ,  5.25310067
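A distance matrix like this one can be searched for the closest pair of points. The zeros on the diagonal (each point's distance to itself) would always win, so one possible approach is to overwrite them with infinity first. The sketch below uses a small made-up set of points (`pts`) rather than the coordinates from this notebook.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Four illustrative points; points 1 and 2 are the closest to each other.
pts = np.array([[0.0, 0.0], [3.0, 4.0], [3.0, 5.0], [10.0, 0.0]])
D = cdist(pts, pts, "euclidean")

# Work on a copy and hide the self-distances on the diagonal.
D_masked = D.copy()
np.fill_diagonal(D_masked, np.inf)

# argmin works on the flattened matrix; unravel_index turns it back into (row, column).
i, j = np.unravel_index(np.argmin(D_masked), D_masked.shape)
print(f"closest pair: ({i}, {j}), distance = {D_masked[i, j]}")
```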

![alt text](https://miro.medium.com/max/748/1*wP8ubuQEIrtxtfd-DTOTig.jpeg)

In [13]:
import pandas as pd

In [14]:
distancias_df=pd.DataFrame(data=distancias)

In [15]:
distancias_df

|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 4.161839 | 3.875047 | 7.465689 | 2.585172 | 5.879921 | 3.246658 | 8.435488 | 7.93372 | 4.264082 |
| 1 | 4.161839 | 0.0 | 7.558433 | 4.057082 | 5.830549 | 9.865286 | 7.275478 | 12.195468 | 11.991097 | 3.45959 |
| 2 | 3.875047 | 7.558433 | 0.0 | 11.249024 | 5.089084 | 5.283939 | 1.267237 | 4.645372 | 4.663106 | 8.115741 |
| 3 | 7.465689 | 4.057082 | 11.249024 | 0.0 | 8.039128 | 12.290079 | 10.710246 | 15.875629 | 15.356904 | 3.973663 |
| 4 | 2.585172 | 5.830549 | 5.089084 | 8.039128 | 0.0 | 4.258743 | 3.942398 | 8.960897 | 7.82555 | 4.127588 |
| 5 | 5.879921 | 9.865286 | 5.283939 | 12.290079 | 4.258743 | 0.0 | 4.187855 | 6.757761 | 4.868751 | 8.345938 |
| 6 | 3.246658 | 7.275478 | 1.267237 | 10.710246 | 3.942398 | 4.187855 | 0.0 | 5.253101 | 4.717575 | 7.313817 |
| 7 | 8.435488 | 12.195468 | 4.645372 | 15.875629 | 8.960897 | 6.757761 | 5.253101 | 0.0 | 2.197708 | 12.566755 |
| 8 | 7.93372 | 11.991097 | 4.663106 | 15.356904 | 7.82555 | 4.868751 | 4.717575 | 2.197708 | 0.0 | 11.752789 |
| 9 | 4.264082 | 3.45959 | 8.115741 | 3.973663 | 4.127588 | 8.345938 | 7.313817 | 12.566755 | 11.752789 | 0.0 |


In [16]:
def vecinos(df, r, i):
    # Return column i of df, keeping only the rows whose distance to point i is below r
    return df[df[i] < r][i]

# We can get the distances from a given point to its neighbours using `vecinos`

In [17]:
radio = 5  # search radius
punto = 4  # index of the point whose neighbours we want
vec_serie = vecinos(distancias_df, radio, punto)
print(vec_serie)

0    2.585172
4    0.000000
5    4.258743
6    3.942398
9    4.127588
Name: 4, dtype: float64


# And finally, we get the indices of those neighbours as a list

In [18]:
list(vec_serie.index.values) 

[0, 4, 5, 6, 9]

# Homework: find a way to manage all pairs of nearest neighbours. For example, node 4 forms the pairs (4,0), (4,4), (4,5), (4,6), (4,9).

Of course, we have to drop the pair (4,4), since it only links node 4 to itself.
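As a starting hint, here is one possible approach (a sketch, not the only solution): loop over every node, apply the same filter that `vecinos` uses, and skip the self-pairs. It runs on a small made-up set of points; `pts`, `dist_df`, `radius` and `pairs` are illustrative names.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

# Four illustrative points and their distance matrix as a DataFrame.
pts = np.array([[0.0, 0.0], [3.0, 4.0], [3.0, 5.0], [10.0, 0.0]])
dist_df = pd.DataFrame(cdist(pts, pts, "euclidean"))
radius = 5.5

pairs = []
for i in dist_df.columns:
    # Same filter as vecinos: rows whose distance to node i is below the radius.
    neighbours = dist_df[dist_df[i] < radius].index
    # Drop the self-pair (i, i).
    pairs.extend((i, j) for j in neighbours if j != i)

print(pairs)
```

Each connection appears twice in this list, once as (i, j) and once as (j, i); keeping only the pairs with `i < j` would give each edge once.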