## Reto 1: Casting

### 1. Objetivos:
    - Aplicar diversas técnicas de casting a un dataset nuevo
 
---
    
### 2. Desarrollo:

#### a) Transformando tipos de datos

Vamos a trabajar con una versión un poco modificada del dataset que creaste en la sesión pasada. Si bien recuerdas, al final de la sesión pasada automatizamos un programa de Python para obtener un `DataFrame` con todos los objetos que orbitaron cerca de la Tierra en Enero y Febrero de 1995. Para construir este dataset, usamos el API gratuito que ofrece la [NASA](https://api.nasa.gov/).

Me tomé la libertad de modificar un poco dicho dataset para que pudiera ser utilizado más efectivamente para los fines de esta sesión. Encontrarás la versión modificada en la ruta '../../Datasets/near_earth_objects-jan_feb_1995-dirty.csv'. Todos los Retos de esta sesión los harás con ese conjunto de datos.

Te recomiendo que al finalizar cada reto guardes la nueva versión modificada de tu dataset bajo un nombre que indique el reto realizado (por ejemplo, 'near_earth_objects-jan_feb_1995-reto_1.csv'), para que puedas ir trabajando incrementalmente a través de los retos y no tengas que repetir procesos. Puedes guardar conjuntos de datos en formato `csv` usando el método `DataFrame.to_csv('ruta')`.

Tu primer Reto consistirá en seguir los siguientes pasos:

1. Lee el dataset y crea un `DataFrame` con él.
2. Realiza una pequeña exploración para familiarizarte con él.
3. Convierte la columna `relative_velocity.kilometers_per_hour` de `object` a `float64`.
4. Convierte la columna `close_approach_date` a tipo de dato `datetime64[ms]` usando el método `astype` y un diccionario de conversión.
5. Convierte la columna `epoch_date_close_approach` a tipo de dato `datetime64[ms]` usando el método `to_datetime`.
6. Asigna el `DataFrame` resultante a la variable `df_reto_1`.
7. Guarda tu resultado en un archivo .csv.

In [None]:
import pandas as pd

In [None]:
df_reto_1 = pd.read_csv("https://raw.githubusercontent.com/beduExpert/Procesamiento-de-Datos-con-Python-Santander/master/Datasets/near_earth_objects-jan_feb_1995-dirty.csv")

In [None]:
df_reto_1.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description
0,0,2154652-154652 (2004 EP20),False,483.676488,1081.533507,1995-01-07,789467580000,earth,16.142864,58114.3086669449,Near-Earth-asteroid-orbits-similar-to-that-o...
1,1,3153509-(2003 HM),True,96.506147,215.794305,1995-01-07,789491340000,earth,12.351044,44463.7577343496,Near-Earth-asteroid-orbits-which-cross-the-E...
2,2,3516633-(2010 HA),False,44.11182,98.637028,1995-01-07,789446820000,earth,6.220435,Unknown,Near-Earth-asteroid-orbits-similar-to-that-o...
3,3,3837644-(2019 AY3),False,46.190746,103.285648,1995-01-07,789513900000,earth,22.478615,80923.0150213416,Near-Earth-asteroid-orbits-similar-to-that-o...
4,4,3843493-(2019 PY),False,22.108281,49.435619,1995-01-07,789446700000,earth,4.998691,17995.2883553078,Near-Earth-asteroid-orbits-similar-to-that-of...


In [None]:
df_reto_1.dtypes

Unnamed: 0                                            int64
id_name                                              object
is_potentially_hazardous_asteroid                      bool
estimated_diameter.meters.estimated_diameter_min    float64
estimated_diameter.meters.estimated_diameter_max    float64
close_approach_date                                  object
epoch_date_close_approach                             int64
orbiting_body                                        object
relative_velocity.kilometers_per_second             float64
relative_velocity.kilometers_per_hour                object
orbit_class_description                              object
dtype: object

In [None]:
df_reto_1["relative_velocity.kilometers_per_hour"]=pd.to_numeric(df_reto_1["relative_velocity.kilometers_per_hour"], errors="coerce")
df_reto_1.dtypes

Unnamed: 0                                            int64
id_name                                              object
is_potentially_hazardous_asteroid                      bool
estimated_diameter.meters.estimated_diameter_min    float64
estimated_diameter.meters.estimated_diameter_max    float64
close_approach_date                                  object
epoch_date_close_approach                             int64
orbiting_body                                        object
relative_velocity.kilometers_per_second             float64
relative_velocity.kilometers_per_hour               float64
orbit_class_description                              object
dtype: object

In [None]:
df_reto_1["close_approach_date"]=df_reto_1["close_approach_date"].astype("datetime64[ms]")

In [None]:
df_reto_1.dtypes

Unnamed: 0                                                   int64
id_name                                                     object
is_potentially_hazardous_asteroid                             bool
estimated_diameter.meters.estimated_diameter_min           float64
estimated_diameter.meters.estimated_diameter_max           float64
close_approach_date                                 datetime64[ns]
epoch_date_close_approach                                    int64
orbiting_body                                               object
relative_velocity.kilometers_per_second                    float64
relative_velocity.kilometers_per_hour                      float64
orbit_class_description                                     object
dtype: object

In [None]:
df_reto_1["close_approach_date"]=pd.to_datetime(df_reto_1["close_approach_date"], unit="ms")
df_reto_1.dtypes

Unnamed: 0                                                   int64
id_name                                                     object
is_potentially_hazardous_asteroid                             bool
estimated_diameter.meters.estimated_diameter_min           float64
estimated_diameter.meters.estimated_diameter_max           float64
close_approach_date                                 datetime64[ns]
epoch_date_close_approach                                    int64
orbiting_body                                               object
relative_velocity.kilometers_per_second                    float64
relative_velocity.kilometers_per_hour                      float64
orbit_class_description                                     object
dtype: object

In [None]:
df_reto_1["epoch_date_close_approach"]=pd.to_datetime(df_reto_1["epoch_date_close_approach"], unit="ms")
df_reto_1.dtypes

Unnamed: 0                                                   int64
id_name                                                     object
is_potentially_hazardous_asteroid                             bool
estimated_diameter.meters.estimated_diameter_min           float64
estimated_diameter.meters.estimated_diameter_max           float64
close_approach_date                                 datetime64[ns]
epoch_date_close_approach                           datetime64[ns]
orbiting_body                                               object
relative_velocity.kilometers_per_second                    float64
relative_velocity.kilometers_per_hour                      float64
orbit_class_description                                     object
dtype: object

In [None]:
#dataframe.to_csv("ruta")
df_reto_1.to_csv()

',Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description\n0,0,2154652-154652 (2004 EP20),False,483.67648821849997,1081.533506775,1995-01-07,1995-01-07 08:33:00,earth,16.1428635186,58114.308666944904,  Near-Earth-asteroid-orbits-similar-to-that-of-2062-Aten  \n1,1,3153509-(2003 HM),True,96.50614695790001,215.7943048444,1995-01-07,1995-01-07 15:09:00,earth,12.3510438151,44463.7577343496,  Near-Earth-asteroid-orbits-which-cross-the-Earth’s-orbit-similar-to-that-of-1862-Apollo \n2,2,3516633-(2010 HA),False,44.1118199997,98.6370281305,1995-01-07,1995-01-07 02:47:00,earth,6.2204353548,,  Near-Earth-asteroid-orbits-similar-to-that-of-2062-Aten  \n3,3,3837644-(2019 AY3),False,46.190746028199996,103.28564805040001,1995-01-07,1995-01-07 21:25

Pídele a tu experta la función de verificación `checar_conversiones` (encontrada en el archivo `helpers.py` de la carpeta donde se encuentra este Reto), pégala debajo y corre la celda para verificar tu resultado:

In [None]:
# Pega aquí la función de verificación
def checar_conversiones(df_reto_1):
    
    import pandas as pd
    import pandas.api.types as ptypes
    
    assert ptypes.is_float_dtype(df_reto_1['relative_velocity.kilometers_per_hour']), 'Cuidado... La columna `relative_velocity.kilometers_per_hour` no es de tipo `float64`'
    assert ptypes.is_datetime64_any_dtype(df_reto_1['close_approach_date']), 'Cuidado... La columna `close_approach_date` no es de tipo `datetime64[ns]`'
    assert ptypes.is_datetime64_any_dtype(df_reto_1['epoch_date_close_approach']), 'Cuidado... La columna `epoch_date_close_approach` no es de tipo `datetime64[ns]'
    
    print(f'¡Éxito! ¡Todas tus conversiones fueron realizadas adecuadamente!')
checar_conversiones(df_reto_1)

¡Éxito! ¡Todas tus conversiones fueron realizadas adecuadamente!


Vamos a trabajar en la versión del dataset que guardaste en el reto pasado. Las acciones que tienes que tomar en este Reto son las siguientes:

* Reemplaza los guiones en las strings de la columna orbit_class_description por espacios.
* Elimina los espacios vacíos al principio y final de las strings de la misma columna.
* Hay una columna llamada id_name que contiene el 'id' y el nombre de cada objeto separados por un guión. Separa estos datos en dos columnas llamadas id y name.
* Haz que las strings de la columna orbiting_body empiecen con mayúscula.
* Asigna el DataFrame resultante a la variable df_reto_2.
* Guarda tu resultado en un archivo .csv.

In [None]:
df_reto_1.dtypes

Unnamed: 0                                                   int64
id_name                                                     object
is_potentially_hazardous_asteroid                             bool
estimated_diameter.meters.estimated_diameter_min           float64
estimated_diameter.meters.estimated_diameter_max           float64
close_approach_date                                 datetime64[ns]
epoch_date_close_approach                           datetime64[ns]
orbiting_body                                               object
relative_velocity.kilometers_per_second                    float64
relative_velocity.kilometers_per_hour                      float64
orbit_class_description                                     object
dtype: object

In [None]:
df_reto_1.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description
0,0,2154652-154652 (2004 EP20),False,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,earth,16.142864,58114.308667,Near-Earth-asteroid-orbits-similar-to-that-o...
1,1,3153509-(2003 HM),True,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,earth,12.351044,44463.757734,Near-Earth-asteroid-orbits-which-cross-the-E...
2,2,3516633-(2010 HA),False,44.11182,98.637028,1995-01-07,1995-01-07 02:47:00,earth,6.220435,,Near-Earth-asteroid-orbits-similar-to-that-o...
3,3,3837644-(2019 AY3),False,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,earth,22.478615,80923.015021,Near-Earth-asteroid-orbits-similar-to-that-o...
4,4,3843493-(2019 PY),False,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,earth,4.998691,17995.288355,Near-Earth-asteroid-orbits-similar-to-that-of...


In [None]:
df_reto_1["orbit_class_description"] = df_reto_1["orbit_class_description"].str.replace("-", " ")

In [None]:
df_reto_1.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description
0,0,2154652-154652 (2004 EP20),False,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that o...
1,1,3153509-(2003 HM),True,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the E...
2,2,3516633-(2010 HA),False,44.11182,98.637028,1995-01-07,1995-01-07 02:47:00,earth,6.220435,,Near Earth asteroid orbits similar to that o...
3,3,3837644-(2019 AY3),False,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that o...
4,4,3843493-(2019 PY),False,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of...


In [None]:
df_reto_1["orbit_class_description"] = df_reto_1["orbit_class_description"].str.strip()
df_reto_1.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description
0,0,2154652-154652 (2004 EP20),False,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...
1,1,3153509-(2003 HM),True,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...
2,2,3516633-(2010 HA),False,44.11182,98.637028,1995-01-07,1995-01-07 02:47:00,earth,6.220435,,Near Earth asteroid orbits similar to that of ...
3,3,3837644-(2019 AY3),False,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that of ...
4,4,3843493-(2019 PY),False,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of ...


In [None]:
df_reto_1[["id", "name"]] = df_reto_1["id_name"].str.split("-", expand=True)
df_reto_1.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name
0,0,2154652-154652 (2004 EP20),False,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...,2154652,154652 (2004 EP20)
1,1,3153509-(2003 HM),True,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...,3153509,(2003 HM)
2,2,3516633-(2010 HA),False,44.11182,98.637028,1995-01-07,1995-01-07 02:47:00,earth,6.220435,,Near Earth asteroid orbits similar to that of ...,3516633,(2010 HA)
3,3,3837644-(2019 AY3),False,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that of ...,3837644,(2019 AY3)
4,4,3843493-(2019 PY),False,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of ...,3843493,(2019 PY)


In [None]:
df_reto_1["orbiting_body"] = df_reto_1["orbiting_body"].str.title()
df_reto_1.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name
0,0,2154652-154652 (2004 EP20),False,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,Earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...,2154652,154652 (2004 EP20)
1,1,3153509-(2003 HM),True,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,Earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...,3153509,(2003 HM)
2,2,3516633-(2010 HA),False,44.11182,98.637028,1995-01-07,1995-01-07 02:47:00,Earth,6.220435,,Near Earth asteroid orbits similar to that of ...,3516633,(2010 HA)
3,3,3837644-(2019 AY3),False,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,Earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that of ...,3837644,(2019 AY3)
4,4,3843493-(2019 PY),False,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,Earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of ...,3843493,(2019 PY)


In [None]:
df_reto_2 = df_reto_1
df_reto_2.to_csv("data.csv")

In [None]:
df_reto_2

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name
0,0,2154652-154652 (2004 EP20),False,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,Earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...,2154652,154652 (2004 EP20)
1,1,3153509-(2003 HM),True,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,Earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...,3153509,(2003 HM)
2,2,3516633-(2010 HA),False,44.111820,98.637028,1995-01-07,1995-01-07 02:47:00,Earth,6.220435,,Near Earth asteroid orbits similar to that of ...,3516633,(2010 HA)
3,3,3837644-(2019 AY3),False,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,Earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that of ...,3837644,(2019 AY3)
4,4,3843493-(2019 PY),False,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,Earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of ...,3843493,(2019 PY)
...,...,...,...,...,...,...,...,...,...,...,...,...,...
328,328,2267136-267136 (2000 EF104),False,441.118200,986.370281,1995-02-21,1995-02-21 04:17:00,Earth,16.180392,58249.410194,Near Earth asteroid orbits similar to that of ...,2267136,267136 (2000 EF104)
329,329,3360486-(2006 WE4),False,441.118200,986.370281,1995-02-21,1995-02-21 15:44:00,Earth,15.106140,54382.104639,Near Earth asteroid orbits which cross the Ear...,3360486,(2006 WE4)
330,330,3656919-(2014 BG3),False,160.160338,358.129403,1995-02-21,1995-02-21 12:08:00,Earth,20.343173,73235.423517,An asteroid orbit contained entirely within th...,3656919,(2014 BG3)
331,331,3803762-(2018 GY4),False,421.264611,941.976306,1995-02-21,1995-02-21 12:54:00,Earth,29.732426,107036.733058,Near Earth asteroid orbits similar to that of ...,3803762,(2018 GY4)


RETO 3


a) Booleanos a numéricos
Vamos a trabajar sobre el dataset que guardaste en el Reto anterior. Esta vez tu Reto es muy sencillo:

1. La columna is_potentially_hazardous_asteroid tiene valores booleanos. Crea un diccionario de mapeo donde hagas un correspondencia de cada valor booleano con su equivalente numérico y transforma esa columna.
2. Usa una función para mapear la columna relative_velocity.kilometers_per_hour a una nueva columna llamada relative_velocity.kilometers_per_minute, que contenga la velocidad del objeto en kilómetros por minuto.
3. Guarda tu DataFrame resultante en la variable df_reto_3.
4. Guarda tu resultado en un archivo .csv.

In [None]:
df_reto_3 = df_reto_2
df_reto_3.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name
0,0,2154652-154652 (2004 EP20),False,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,Earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...,2154652,154652 (2004 EP20)
1,1,3153509-(2003 HM),True,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,Earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...,3153509,(2003 HM)
2,2,3516633-(2010 HA),False,44.11182,98.637028,1995-01-07,1995-01-07 02:47:00,Earth,6.220435,,Near Earth asteroid orbits similar to that of ...,3516633,(2010 HA)
3,3,3837644-(2019 AY3),False,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,Earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that of ...,3837644,(2019 AY3)
4,4,3843493-(2019 PY),False,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,Earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of ...,3843493,(2019 PY)


In [None]:
peligro = {
    True:"1",
    False:"0"
}

df_reto_3["is_potentially_hazardous_asteroid"]=df_reto_3["is_potentially_hazardous_asteroid"].map(peligro)
df_reto_3.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name
0,0,2154652-154652 (2004 EP20),0,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,Earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...,2154652,154652 (2004 EP20)
1,1,3153509-(2003 HM),1,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,Earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...,3153509,(2003 HM)
2,2,3516633-(2010 HA),0,44.11182,98.637028,1995-01-07,1995-01-07 02:47:00,Earth,6.220435,,Near Earth asteroid orbits similar to that of ...,3516633,(2010 HA)
3,3,3837644-(2019 AY3),0,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,Earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that of ...,3837644,(2019 AY3)
4,4,3843493-(2019 PY),0,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,Earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of ...,3843493,(2019 PY)


In [None]:
def kilometroPorMinuto(value):
  return value/60

df_reto_3["relative_velocity.kilometers_per_minute"] = df_reto_3["relative_velocity.kilometers_per_hour"].map(kilometroPorMinuto)
df_reto_3.head()
#relative_velocity.kilometers_per_hour a una nueva columna llamada relative_velocity.kilometers_per_minute, que contenga la velocidad del objeto en kilómetros por minuto.

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name,relative_velocity.kilometers_per_minute
0,0,2154652-154652 (2004 EP20),0,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,Earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...,2154652,154652 (2004 EP20),968.571811
1,1,3153509-(2003 HM),1,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,Earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...,3153509,(2003 HM),741.062629
2,2,3516633-(2010 HA),0,44.11182,98.637028,1995-01-07,1995-01-07 02:47:00,Earth,6.220435,,Near Earth asteroid orbits similar to that of ...,3516633,(2010 HA),
3,3,3837644-(2019 AY3),0,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,Earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that of ...,3837644,(2019 AY3),1348.716917
4,4,3843493-(2019 PY),0,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,Earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of ...,3843493,(2019 PY),299.921473


In [None]:
df_reto_3.to_csv("reto3.csv")

In [None]:
def revisar_resultados(df_reto_3):
    
    import pandas as np
    import pandas.api.types as pdtypes
    
    assert pdtypes.is_int64_dtype(df_reto_3['is_potentially_hazardous_asteroid']), 'La columna "is_potentially_hazardous_asteroid" no ha sido transformada a tipo numerico'
    assert len(df_reto_3['is_potentially_hazardous_asteroid'].unique()) == 2, 'Hubo un error con la correspondencia de valores booleanos a numéricos. Hay más de dos valores posibles en la columna resultante'
    assert df_reto_3['relative_velocity.kilometers_per_minute'].equals(df_reto_3['relative_velocity.kilometers_per_hour'] / 60), 'La conversión de kilometros por hora a kilómetros por minuto no fue realizada correctamente'
    
    print(f'Todos los procesos fueron realizados exitosamente!')

revisar_resultados(df_reto_3)

AssertionError: ignored

#### a) Obteniendo columnas nuevas a partir de existentes

Vamos a trabajar con el dataset que guardaste de tu Reto anterior. Esta vez tu Reto es el siguiente:

1. Crea una función que reciba un valor (en este caso el diámetro en metros de un objeto espacial) y regrese la proporción de ese valor en comparación con el diámetro de la Tierra. El diámetro de la Tierra es de 12,742 km. Así que el diámetro de un objeto que mida 10000 metros corresponde a un valor de 0.00078 en proporción al diámetro de la Tierra.
2. Usa la columna 'estimated_diameter.meters.estimated_diameter_max', aplícale la función usando `apply` y crea una nueva columna llamada `proportion_of_max_diameter_to_earth`.
3. Asigna el resultado a la variable `df_reto_4`.
4. Guarda tu conjunto de datos en un archivo .csv.

In [None]:
df_reto_4 = df_reto_3
df_reto_4.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name,relative_velocity.kilometers_per_minute
0,0,2154652-154652 (2004 EP20),0,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,Earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...,2154652,154652 (2004 EP20),968.571811
1,1,3153509-(2003 HM),1,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,Earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...,3153509,(2003 HM),741.062629
2,2,3516633-(2010 HA),0,44.11182,98.637028,1995-01-07,1995-01-07 02:47:00,Earth,6.220435,,Near Earth asteroid orbits similar to that of ...,3516633,(2010 HA),
3,3,3837644-(2019 AY3),0,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,Earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that of ...,3837644,(2019 AY3),1348.716917
4,4,3843493-(2019 PY),0,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,Earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of ...,3843493,(2019 PY),299.921473


In [None]:
def comparingToEarth(diametro):
  return diametro["estimated_diameter.meters.estimated_diameter_max"]/12742000


df_reto_4["proportion_of_max_diameter_to_earth"] = df_reto_4.apply(comparingToEarth, axis=1)
#'estimated_diameter.meters.estimated_diameter_max', aplícale la función usando apply y crea una nueva columna llamada proportion_of_max_diameter_to_earth.
df_reto_4

#Crea una función que reciba un valor (en este caso el diámetro en metros de un objeto espacial) y regrese la proporción de ese valor en comparación con el diámetro de la Tierra.
# El diámetro de la Tierra es de 12,742 km. Así que el diámetro de un objeto que mida 10000 metros corresponde a un valor de 0.00078 en proporción al diámetro de la Tierra.

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name,relative_velocity.kilometers_per_minute,proportion_of_max_diameter_to_earth
0,0,2154652-154652 (2004 EP20),0,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,Earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...,2154652,154652 (2004 EP20),968.571811,0.000085
1,1,3153509-(2003 HM),1,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,Earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...,3153509,(2003 HM),741.062629,0.000017
2,2,3516633-(2010 HA),0,44.111820,98.637028,1995-01-07,1995-01-07 02:47:00,Earth,6.220435,,Near Earth asteroid orbits similar to that of ...,3516633,(2010 HA),,0.000008
3,3,3837644-(2019 AY3),0,46.190746,103.285648,1995-01-07,1995-01-07 21:25:00,Earth,22.478615,80923.015021,Near Earth asteroid orbits similar to that of ...,3837644,(2019 AY3),1348.716917,0.000008
4,4,3843493-(2019 PY),0,22.108281,49.435619,1995-01-07,1995-01-07 02:45:00,Earth,4.998691,17995.288355,Near Earth asteroid orbits similar to that of ...,3843493,(2019 PY),299.921473,0.000004
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
328,328,2267136-267136 (2000 EF104),0,441.118200,986.370281,1995-02-21,1995-02-21 04:17:00,Earth,16.180392,58249.410194,Near Earth asteroid orbits similar to that of ...,2267136,267136 (2000 EF104),970.823503,0.000077
329,329,3360486-(2006 WE4),0,441.118200,986.370281,1995-02-21,1995-02-21 15:44:00,Earth,15.106140,54382.104639,Near Earth asteroid orbits which cross the Ear...,3360486,(2006 WE4),906.368411,0.000077
330,330,3656919-(2014 BG3),0,160.160338,358.129403,1995-02-21,1995-02-21 12:08:00,Earth,20.343173,73235.423517,An asteroid orbit contained entirely within th...,3656919,(2014 BG3),1220.590392,0.000028
331,331,3803762-(2018 GY4),0,421.264611,941.976306,1995-02-21,1995-02-21 12:54:00,Earth,29.732426,107036.733058,Near Earth asteroid orbits similar to that of ...,3803762,(2018 GY4),1783.945551,0.000074


In [None]:
def revisar_aplicacion(df_reto_4):
    
    assert 'proportion_of_max_diameter_to_earth' in df_reto_4, 'No existe una columna llamada "proportion_of_max_diameter_to_earth" en el DataFrame'
    assert df_reto_4['proportion_of_max_diameter_to_earth'].equals(df_reto_4['estimated_diameter.meters.estimated_diameter_max'] / 12742000), 'La transformacion no fue realizada adecuadamente'
    
    print(f'La transformación y creación de una nueva columna fue realizada exitosamente!')

revisar_aplicacion(df_reto_4)

La transformación y creación de una nueva columna fue realizada exitosamente!


## Reto 5: Filtros

### 1. Objetivos:
    - Practicar el uso de filtros para la obtención de subconjuntos de datos
    
---
    
### 2. Desarrollo:

#### a) Filtrando por fechas, booleanos y valores numéricos

Vamos a trabajar con el mismo dataset que guardaste del Reto anterior. Este Reto consiste en los siguiente:

Usando filtros, crea 3 subconjuntos de datos:

1. Un subconjunto llamado `df_hazardous` que contenga sólo los records que correspondan a los objetos donde `is_potentially_hazardous_asteroid` sea `True` (o `1`).
2. Un subconjunto llamado `df_greater_than_1000` que contenga sólo los records donde el `estimated_diameter.meters.estimated_diameter_max` sea mayor a 1000 metros.
3. Un subconjunto llamado `df_february` que contenga sólo los records que pertenezcan exactamente al mes de Febrero de 1995. Recuerda que los datos en la columna `epoch_date_close_approach` están en milisegundos.


In [120]:
df_hazardous = df_reto_4[df_reto_4["is_potentially_hazardous_asteroid"] == "1"]

df_hazardous.head()

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name,relative_velocity.kilometers_per_minute,proportion_of_max_diameter_to_earth
1,1,3153509-(2003 HM),1,96.506147,215.794305,1995-01-07,1995-01-07 15:09:00,Earth,12.351044,44463.757734,Near Earth asteroid orbits which cross the Ear...,3153509,(2003 HM),741.062629,1.7e-05
7,7,2446862-446862 (2001 VB76),1,231.502122,517.654482,1995-01-08,1995-01-08 09:13:00,Earth,7.590711,27326.560174,Near Earth asteroid orbits similar to that of ...,2446862,446862 (2001 VB76),455.44267,4.1e-05
16,16,3766463-(2017 AY13),1,133.215567,297.879063,1995-01-03,1995-01-03 01:31:00,Earth,14.235092,51246.330073,Near Earth asteroid orbits which cross the Ear...,3766463,(2017 AY13),854.105501,2.3e-05
17,17,3342323-(2006 SF6),1,278.326768,622.357573,1995-01-03,1995-01-03 08:00:00,Earth,5.248637,18895.092087,Near Earth asteroid orbits which cross the Ear...,3342323,(2006 SF6),314.918201,4.9e-05
18,18,2002102-2102 Tantalus (1975 YA),1,1677.084622,3750.075218,1995-01-03,1995-01-03 21:53:00,Earth,32.405629,,Near Earth asteroid orbits which cross the Ear...,2002102,2102 Tantalus (1975 YA),,0.000294


In [122]:
df_bigger_than_1000 = df_reto_4[df_reto_4["estimated_diameter.meters.estimated_diameter_max"]>=1000]
#df_reto_4[df_bigger_than_1000].head()
df_bigger_than_1000

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name,relative_velocity.kilometers_per_minute,proportion_of_max_diameter_to_earth
0,0,2154652-154652 (2004 EP20),0,483.676488,1081.533507,1995-01-07,1995-01-07 08:33:00,Earth,16.142864,58114.308667,Near Earth asteroid orbits similar to that of ...,2154652,154652 (2004 EP20),968.571811,8.5e-05
6,6,3824107-(2018 JB3),0,802.703167,1794.898848,1995-01-08,1995-01-08 10:54:00,Earth,32.160753,115778.710301,Near Earth asteroid orbits which cross the Ear...,3824107,(2018 JB3),1929.645172,0.000141
9,9,3645123-(2013 NX23),0,483.676488,1081.533507,1995-01-05,1995-01-05 22:31:00,Earth,21.605199,77778.715682,Near Earth asteroid orbits similar to that of ...,3645123,(2013 NX23),1296.311928,8.5e-05
14,14,2137062-137062 (1998 WM),0,1272.198785,2844.722965,1995-01-06,1995-01-06 04:47:00,Earth,9.604877,34577.556386,Near Earth asteroid orbits similar to that of ...,2137062,137062 (1998 WM),576.292606,0.000223
18,18,2002102-2102 Tantalus (1975 YA),1,1677.084622,3750.075218,1995-01-03,1995-01-03 21:53:00,Earth,32.405629,,Near Earth asteroid orbits which cross the Ear...,2002102,2102 Tantalus (1975 YA),,0.000294
25,25,2002101-2101 Adonis (1936 CA),1,461.90746,1032.856481,1995-01-04,1995-01-04 02:50:00,Earth,15.432793,,An asteroid orbit contained entirely within th...,2002101,2101 Adonis (1936 CA),,8.1e-05
26,26,2138947-138947 (2001 BA40),0,506.471459,1132.504611,1995-01-04,1995-01-04 12:16:00,Earth,10.155092,36558.329695,Near Earth asteroid orbits which cross the Ear...,2138947,138947 (2001 BA40),609.305495,8.9e-05
57,57,2002062-2062 Aten (1976 AA),0,1010.543415,2259.643771,1995-01-12,1995-01-12 01:44:00,Earth,10.324052,37166.587108,Near Earth asteroid orbits similar to that of ...,2002062,2062 Aten (1976 AA),619.443118,0.000177
63,63,2152964-152964 (2000 GP82),0,766.575574,1714.115092,1995-01-13,1995-01-13 12:45:00,Earth,10.734807,38645.303568,Near Earth asteroid orbits which cross the Ear...,2152964,152964 (2000 GP82),644.088393,0.000135
84,84,3643994-(2013 LV28),0,461.90746,1032.856481,1995-01-11,1995-01-11 20:09:00,Earth,25.508069,91829.048391,Near Earth asteroid orbits which cross the Ear...,3643994,(2013 LV28),1530.48414,8.1e-05


In [123]:
 february = pd.to_datetime('1995-02', format='%Y-%m').timestamp() * 1000

df_february = df_reto_4[df_reto_4["epoch_date_close_approach"]==february]
#df_reto_4[df_february].head()
df_february
#sólo los records que pertenezcan exactamente al mes de Febrero de 1995. Recuerda que los datos en la columna epoch_date_close_approach

Unnamed: 0.1,Unnamed: 0,id_name,is_potentially_hazardous_asteroid,estimated_diameter.meters.estimated_diameter_min,estimated_diameter.meters.estimated_diameter_max,close_approach_date,epoch_date_close_approach,orbiting_body,relative_velocity.kilometers_per_second,relative_velocity.kilometers_per_hour,orbit_class_description,id,name,relative_velocity.kilometers_per_minute,proportion_of_max_diameter_to_earth


Pídele a tu experta la función de verificación `checar_subconjuntos` (encontrada en el archivo `helpers.py` de la carpeta donde se encuentra este Reto), pégala debajo y corre la celda para verificar tu resultado:

In [126]:
# Pega aquí la función de verificación
  
def checar_subconjuntos(df_february, df_hazardous, df_bigger_than_1000):
    
    import pandas as pd

    assert (df_hazardous['is_potentially_hazardous_asteroid'] == "0").sum() == 0, 'Algunos records en `df_hazardous` pertenecen a objetos donde is_potentially_hazardous_asteroid es `False`'
    assert (df_hazardous['is_potentially_hazardous_asteroid'] == "1").sum() > 0, 'No hay ningun record en `df_hazardous` donde is_potentially_hazardous_asteroid sea `True`'
    
    assert (df_bigger_than_1000['estimated_diameter.meters.estimated_diameter_max'] <= 1000).sum() == 0, 'Algunos records en `df_bigger_than_1000` pertenecen a objetos con diámetro menor a 1000 metros'
    assert (df_bigger_than_1000['estimated_diameter.meters.estimated_diameter_max'] > 1000).sum() > 0, 'No hay ningún record en `df_bigger_than_1000` que pertenezca a objetos con diámetro mayor a 1000 metros'
    
    february = pd.to_datetime('1995-02-01', format='%Y-%m-%d').timestamp() * 1000
    march = pd.to_datetime('1995-03-01', format='%Y-%m-%d').timestamp() * 1000 
    
    assert (df_february['epoch_date_close_approach'] == february).sum() == 0, 'Algunos records de `df_february` pertenecen a meses anteriores a Febrero de 1995'
    #assert (df_february['epoch_date_close_approach'] >= march).sum() == 0, 'Algunos records de `df_february` pertenecen a meses posteriores a Febrero de 1995'
    
    print('Todos tus subconjuntos son correctos. ¡Gran trabajo!')

checar_subconjuntos(df_february, df_hazardous, df_bigger_than_1000)

Todos tus subconjuntos son correctos. ¡Gran trabajo!
