# Data Processing with Python
## Final Project

Objectives
- Obtain data from a remote source
- Build an ETL (Extract, Transform, Load) process
- Use functions from the Python Standard Library
- Write analysis functions with `filter` and `map`
- Use `Jupyter Notebook`
- Work with Python 3 virtual environments
- Use Pandas and Matplotlib
- Use GitHub

Project
We live in difficult times. COVID-19 has changed how we see things in many
respects, and social networks such as Twitter have captured many of the
messages posted by people around the world.
In this project you will analyze Twitter users' posts related to this topic
of the moment.

In [2]:
# Packages and data retrieval

import csv
import io
from datetime import datetime

import pandas as pd
import requests

# CONSTANTS
FILENAME = 'http://galileoguzman.com/data/covid19_tweets.csv'  # URL of the remote CSV

# Download the raw bytes, decode them to text, and load them into a DataFrame
s = requests.get(FILENAME).content
dfcovid = pd.read_csv(io.StringIO(s.decode()))


## Cleaning the DataFrame

In [3]:
dfcovid.head()

Unnamed: 0,user_name,user_location,user_description,user_created,user_followers,user_friends,user_favourites,user_verified,date,text,hashtags,source,is_retweet
0,ᏉᎥ☻լꂅϮ,astroworld,wednesday addams as a disney princess keepin i...,2017-05-26 05:46:42,624,950,18775,False,2020-07-25 12:27:21,If I smelled the scent of hand sanitizers toda...,,Twitter for iPhone,False
1,Tom Basile 🇺🇸,"New York, NY","Husband, Father, Columnist & Commentator. Auth...",2009-04-16 20:06:23,2253,1677,24,True,2020-07-25 12:27:17,Hey @Yankees @YankeesPR and @MLB - wouldn't it...,,Twitter for Android,False
2,Time4fisticuffs,"Pewee Valley, KY",#Christian #Catholic #Conservative #Reagan #Re...,2009-02-28 18:57:41,9275,9525,7254,False,2020-07-25 12:27:14,@diane3443 @wdunlap @realDonaldTrump Trump nev...,['COVID19'],Twitter for Android,False
3,ethel mertz,Stuck in the Middle,#Browns #Indians #ClevelandProud #[]_[] #Cavs ...,2019-03-07 01:45:06,197,987,1488,False,2020-07-25 12:27:10,@brookbanktv The one gift #COVID19 has give me...,['COVID19'],Twitter for iPhone,False
4,DIPR-J&K,Jammu and Kashmir,🖊️Official Twitter handle of Department of Inf...,2017-02-12 06:45:15,101009,168,101,False,2020-07-25 12:27:08,25 July : Media Bulletin on Novel #CoronaVirus...,"['CoronaVirusUpdates', 'COVID19']",Twitter for Android,False


In [5]:
dfcovid.dtypes

user_name           object
user_location       object
user_description    object
user_created        object
user_followers       int64
user_friends         int64
user_favourites      int64
user_verified         bool
date                object
text                object
hashtags            object
source              object
is_retweet            bool
dtype: object

In [15]:
casting = {
    'user_name': 'string',
    'user_location': 'string',
    'user_description': 'string',
    'user_created': 'datetime64[ns]',
    'date': 'datetime64[ns]',
    'text': 'string',
    'hashtags': 'string',
    'source': 'string',
}
df = dfcovid.astype(casting)
df.head()

Unnamed: 0,user_name,user_location,user_description,user_created,user_followers,user_friends,user_favourites,user_verified,date,text,hashtags,source,is_retweet
0,ᏉᎥ☻լꂅϮ,astroworld,wednesday addams as a disney princess keepin i...,2017-05-26 05:46:42,624,950,18775,False,2020-07-25 12:27:21,If I smelled the scent of hand sanitizers toda...,,Twitter for iPhone,False
1,Tom Basile 🇺🇸,"New York, NY","Husband, Father, Columnist & Commentator. Auth...",2009-04-16 20:06:23,2253,1677,24,True,2020-07-25 12:27:17,Hey @Yankees @YankeesPR and @MLB - wouldn't it...,,Twitter for Android,False
2,Time4fisticuffs,"Pewee Valley, KY",#Christian #Catholic #Conservative #Reagan #Re...,2009-02-28 18:57:41,9275,9525,7254,False,2020-07-25 12:27:14,@diane3443 @wdunlap @realDonaldTrump Trump nev...,['COVID19'],Twitter for Android,False
3,ethel mertz,Stuck in the Middle,#Browns #Indians #ClevelandProud #[]_[] #Cavs ...,2019-03-07 01:45:06,197,987,1488,False,2020-07-25 12:27:10,@brookbanktv The one gift #COVID19 has give me...,['COVID19'],Twitter for iPhone,False
4,DIPR-J&K,Jammu and Kashmir,🖊️Official Twitter handle of Department of Inf...,2017-02-12 06:45:15,101009,168,101,False,2020-07-25 12:27:08,25 July : Media Bulletin on Novel #CoronaVirus...,"['CoronaVirusUpdates', 'COVID19']",Twitter for Android,False


In [16]:
df.dtypes

user_name                   string
user_location               string
user_description            string
user_created        datetime64[ns]
user_followers               int64
user_friends                 int64
user_favourites              int64
user_verified                 bool
date                datetime64[ns]
text                        string
hashtags                    string
source                      string
is_retweet                    bool
dtype: object

## Analysis
1.- Write and run a function that calculates how many days have elapsed, as of
the day it is run, since the first time a user posted a tweet about the
coronavirus.
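
One possible starting point for this exercise is sketched below, using only the standard library. The function name `days_since_first_tweet`, the `today` parameter, and the `DATE_FORMAT` constant are assumptions made for illustration and are not part of the original notebook. The sketch treats the earliest value in the `date` column as "the first time a user posted", and the sample values are taken from the `head()` output above.

```python
from datetime import datetime

# Assumed format, matching the values shown in the `date` column above
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def days_since_first_tweet(dates, today=None):
    """Return the whole days elapsed between the earliest date and `today`.

    `dates` may hold strings in DATE_FORMAT or datetime objects.
    `today` defaults to the moment the function is run.
    """
    # Parse strings; pass datetime objects through unchanged
    parsed = [
        d if isinstance(d, datetime) else datetime.strptime(d, DATE_FORMAT)
        for d in dates
    ]
    if not parsed:
        raise ValueError("no tweet dates supplied")
    first = min(parsed)  # earliest tweet in the input
    if today is None:
        today = datetime.now()
    return (today - first).days


# Sample dates copied from the first rows of the dataset
sample = ["2020-07-25 12:27:21", "2020-07-25 12:27:17", "2020-07-25 12:27:08"]
print(days_since_first_tweet(sample, today=datetime(2020, 8, 1, 12, 27, 8)))  # 7
```

In the notebook, the natural input would be the converted `date` column, for example `days_since_first_tweet(df['date'].dropna())`. Passing `today` explicitly keeps the result reproducible; leaving it out uses the current date and time.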