## Data processing:

First, the data is loaded with the `ujson` library. The `pandas` library is then used to view and handle the data more conveniently.

In [1]:
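'''
Code obtained from https://stackoverflow.com/questions/63501251/how-to-open-ndjson-file-in-python
'''
# %pip install ujson
# %pip install pandas

import ujson as json
import pandas as pd

# Each line of the file is a separate JSON document (NDJSON).
# The context manager closes the file once every line has been parsed.
with open('data/farmers-protest-tweets-2021-03-5.json') as ndjson_file:
    records = map(json.loads, ndjson_file)
    df = pd.DataFrame.from_records(records)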

In [2]:
df.head(3)

Unnamed: 0,url,date,content,renderedContent,id,user,outlinks,tcooutlinks,replyCount,retweetCount,...,quoteCount,conversationId,lang,source,sourceUrl,sourceLabel,media,retweetedTweet,quotedTweet,mentionedUsers
0,https://twitter.com/ShashiRajbhar6/status/1376...,2021-03-30T03:33:46+00:00,Support 👇\n\n#FarmersProtest,Support 👇\n\n#FarmersProtest,1376739399593910273,"{'username': 'ShashiRajbhar6', 'displayname': ...",[],[],0,0,...,0,1376739399593910273,en,"<a href=""http://twitter.com/download/android"" ...",http://twitter.com/download/android,Twitter for Android,,,,
1,https://twitter.com/kaursuk06272818/status/137...,2021-03-30T03:33:23+00:00,Supporting farmers means supporting our countr...,Supporting farmers means supporting our countr...,1376739306287427584,"{'username': 'kaursuk06272818', 'displayname':...",[],[],0,0,...,0,1376739306287427584,en,"<a href=""http://twitter.com/download/android"" ...",http://twitter.com/download/android,Twitter for Android,[{'previewUrl': 'https://pbs.twimg.com/media/E...,,,
2,https://twitter.com/kaursuk06272818/status/137...,2021-03-30T03:31:00+00:00,Support farmers if you are related to food #St...,Support farmers if you are related to food #St...,1376738704128020488,"{'username': 'kaursuk06272818', 'displayname':...",[],[],0,0,...,0,1376738704128020488,en,"<a href=""http://twitter.com/download/android"" ...",http://twitter.com/download/android,Twitter for Android,[{'previewUrl': 'https://pbs.twimg.com/media/E...,,,
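
The loading step can also be written with the standard library only. The following is a minimal sketch, assuming the file holds one JSON document per line; the helper names `parse_ndjson_lines` and `iter_ndjson` are hypothetical and not part of the original notebook.

```python
import json
from typing import Iterable, Iterator


def parse_ndjson_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            yield json.loads(line)


def iter_ndjson(path: str) -> Iterator[dict]:
    """Open an NDJSON file and yield its records lazily (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        yield from parse_ndjson_lines(handle)


# Small in-memory demonstration, no file needed.
sample_records = list(parse_ndjson_lines(['{"id": 1}\n', '\n', '{"id": 2}']))
```

Since the original cell already passes an iterator to `pd.DataFrame.from_records`, the generator returned by `iter_ndjson` should be usable in the same place.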


## Top 10 most retweeted tweets:

Source: https://datascientyst.com/get-top-10-highest-lowest-values-pandas/

In [20]:
def top_tweets(top_n):
    # Select the top_n rows of df with the highest retweetCount.
    top_tweet_info = df.nlargest(n=top_n, columns=['retweetCount'])
    print(f'The top {top_n} most retweeted tweets are:')
    return top_tweet_info

In [4]:
# top_tweets(10)
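
As a possible variation, the DataFrame can be passed in explicitly and the result limited to a few readable columns. This is only a sketch on a toy DataFrame: the function name `top_retweeted` and the chosen columns are assumptions, not part of the original notebook.

```python
import pandas as pd


def top_retweeted(frame: pd.DataFrame, top_n: int = 10) -> pd.DataFrame:
    """Return the top_n rows by retweetCount, keeping a few columns only."""
    columns = ["date", "content", "retweetCount"]  # assumed subset for readability
    return frame.nlargest(n=top_n, columns="retweetCount")[columns]


# Toy data standing in for the real tweet DataFrame.
toy = pd.DataFrame({
    "date": ["2021-02-02", "2021-02-03", "2021-02-04"],
    "content": ["a", "b", "c"],
    "retweetCount": [5, 42, 17],
    "lang": ["en", "en", "en"],
})
result = top_retweeted(toy, top_n=2)
```

Taking the DataFrame as a parameter avoids relying on the global `df`, which makes the function easier to try on small samples.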

## Top 10 users with the most tweets:

Sources: \
https://stackoverflow.com/questions/11902665/top-values-from-dictionary \
https://stackoverflow.com/questions/28663856/how-to-count-the-occurrence-of-certain-item-in-an-ndarray

In [22]:
import numpy as np

users_np = df['user'].to_numpy()

# Each entry of the 'user' column is a dict; keep only its username.
users = []
for user in users_np:
    users.append(user['username'])

# Count how many tweets each username has.
unique, counts = np.unique(np.array(users), return_counts=True)
dict_tweets = dict(zip(unique, counts))

def keyfunction_tweets(k):
    return dict_tweets[k]

def top_users(top_n):
    print(f'The top {top_n} users who posted the most tweets are:')
    for key in sorted(dict_tweets, key=keyfunction_tweets, reverse=True)[:top_n]:
        print(f'{key}: {dict_tweets[key]} tweets')

In [6]:
# top_users(10)
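
The same count-then-sort idea can be sketched with `collections.Counter` from the standard library, which combines counting and ranking. The function name `count_usernames` and the sample records are hypothetical; only the `username` key is taken from the data shown above.

```python
from collections import Counter


def count_usernames(user_records):
    """Count tweets per username from an iterable of user dicts."""
    return Counter(user["username"] for user in user_records)


# Toy records standing in for the 'user' column.
sample_users = [
    {"username": "alice"},
    {"username": "bob"},
    {"username": "alice"},
]
top_user = count_usernames(sample_users).most_common(1)
```

`most_common(n)` returns the `n` most frequent entries as `(key, count)` pairs, so no separate key function or `sorted` call is needed.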

## Top 10 days with the most tweets

In [23]:
dates_np = df['date'].to_numpy()

# Timestamps look like '2021-03-30T03:33:46+00:00'; keep only the date part before the 'T'.
date_list = []
for date in dates_np:
    date_list.append(date.split("T")[0])

# Count how many tweets were posted on each day.
unique, counts = np.unique(np.array(date_list), return_counts=True)
dict_dates = dict(zip(unique, counts))

def keyfunction_dates(k):
    return dict_dates[k]

def top_dates(top_n):
    print(f'The top {top_n} days with the most tweets are:')
    for key in sorted(dict_dates, key=keyfunction_dates, reverse=True)[:top_n]:
        print(f'{key}: {dict_dates[key]} tweets')

In [8]:
# top_dates(10)
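
Splitting on `"T"` works because the timestamps are ISO 8601 strings. A slightly stricter sketch parses each timestamp with `datetime.fromisoformat`, which raises an error on malformed input. The function name `count_tweets_per_day` is hypothetical, and the day is taken in the timestamp's own offset (UTC in the sample shown earlier).

```python
from collections import Counter
from datetime import datetime


def count_tweets_per_day(timestamps):
    """Count tweets per calendar day from ISO 8601 timestamp strings."""
    days = (datetime.fromisoformat(ts).date().isoformat() for ts in timestamps)
    return Counter(days)


# Timestamps in the same format as the 'date' column.
sample_timestamps = [
    "2021-03-30T03:33:46+00:00",
    "2021-03-30T03:33:23+00:00",
    "2021-02-03T06:14:01+00:00",
]
per_day = count_tweets_per_day(sample_timestamps)
```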

## Top 10 most used hashtags:

Source: https://stackoverflow.com/questions/2678666/regex-to-find-words-that-start-with-a-specific-character

In [24]:
import re

content_np = df['content'].to_numpy()

# A hashtag is a '#' not preceded by a word character, followed by word characters.
# Hashtags are upper-cased so that counting is case-insensitive.
lista_hashtags = []
for content in content_np:
    finded = re.findall(r"(?<!\w)#\w+", content)
    for elem in finded:
        lista_hashtags.append(elem.upper())

unique, counts = np.unique(np.array(lista_hashtags), return_counts=True)
dict_hashtags = dict(zip(unique, counts))

def keyfunction_hashtags(k):
    return dict_hashtags[k]

def top_hashtags(top_n):
    print(f'The top {top_n} most used hashtags are:')
    for key in sorted(dict_hashtags, key=keyfunction_hashtags, reverse=True)[:top_n]:
        print(f'{key}: {dict_hashtags[key]} times')

In [10]:
# top_hashtags(10)
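
To illustrate what the pattern matches, here is a small sketch on made-up strings. The negative lookbehind `(?<!\w)` rejects a `#` that directly follows a letter, digit or underscore, so something like `a#b` is not counted as a hashtag. The names `HASHTAG_PATTERN` and `count_hashtags` are hypothetical.

```python
import re
from collections import Counter

# Raw string so that \w reaches the regex engine unchanged.
HASHTAG_PATTERN = re.compile(r"(?<!\w)#\w+")


def count_hashtags(texts):
    """Count hashtags case-insensitively across an iterable of strings."""
    counter = Counter()
    for text in texts:
        counter.update(tag.upper() for tag in HASHTAG_PATTERN.findall(text))
    return counter


sample_texts = [
    "Support #FarmersProtest",
    "#farmersprotest now",
    "not a tag: a#b",
]
hashtag_counts = count_hashtags(sample_texts)
```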

## Main

In [25]:
def main(funct):
    if funct == "tweet":
        return top_tweets(10)
    elif funct == "user":
        return top_users(10)
    elif funct == "date":
        return top_dates(10)
    elif funct == "hashtag":
        return top_hashtags(10)
    else:
        return "You must specify the function: tweet, user, date or hashtag."
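
The `if`/`elif` chain could alternatively be expressed as a dictionary lookup. The sketch below uses stand-in handlers instead of the notebook's functions; `make_dispatcher` and the lambdas are hypothetical names introduced only for illustration.

```python
def make_dispatcher(handlers, top_n=10):
    """Build a function that maps a name to handlers[name](top_n)."""
    def dispatch(name):
        handler = handlers.get(name)
        if handler is None:
            options = ", ".join(handlers)
            return f"You must specify one of: {options}."
        return handler(top_n)
    return dispatch


# Stand-in handlers; in the notebook these would be top_tweets, top_users, etc.
demo = make_dispatcher({
    "tweet": lambda n: f"top {n} tweets",
    "user": lambda n: f"top {n} users",
})
```

With this layout, supporting a new option only requires adding an entry to the dictionary, and the error message lists the valid names automatically.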

In [26]:
main("tweet")

The top 10 most retweeted tweets are:


Unnamed: 0,url,date,content,renderedContent,id,user,outlinks,tcooutlinks,replyCount,retweetCount,...,quoteCount,conversationId,lang,source,sourceUrl,sourceLabel,media,retweetedTweet,quotedTweet,mentionedUsers
408128,https://twitter.com/rihanna/status/13566258896...,2021-02-02T15:29:51+00:00,why aren’t we talking about this?! #FarmersPro...,why aren’t we talking about this?! #FarmersPro...,1356625889602199552,"{'username': 'rihanna', 'displayname': 'Rihann...",[https://www.cnn.com/2021/02/01/asia/india-int...,[https://t.co/obmIlXhK9S],163065,315547,...,45832,1356625889602199552,en,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,,,,
395142,https://twitter.com/GretaThunberg/status/13566...,2021-02-02T20:04:01+00:00,We stand in solidarity with the #FarmersProtes...,We stand in solidarity with the #FarmersProtes...,1356694884615340037,"{'username': 'GretaThunberg', 'displayname': '...",[https://www.cnn.com/2021/02/01/asia/india-int...,[https://t.co/tqvR0oHgo0],49793,103957,...,13815,1356694884615340037,en,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,,,,
266196,https://twitter.com/GretaThunberg/status/13572...,2021-02-04T10:59:01+00:00,I still #StandWithFarmers and support their pe...,I still #StandWithFarmers and support their pe...,1357282507616645122,"{'username': 'GretaThunberg', 'displayname': '...",[],[],39596,67694,...,10587,1357282507616645122,en,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,,,,
366579,https://twitter.com/miakhalifa/status/13568483...,2021-02-03T06:14:01+00:00,"“Paid actors,” huh? Quite the casting director...","“Paid actors,” huh? Quite the casting director...",1356848397899112448,"{'username': 'miakhalifa', 'displayname': 'Mia...",[],[],15569,35921,...,5681,1356848397899112448,en,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,[{'previewUrl': 'https://pbs.twimg.com/media/E...,,,
372793,https://twitter.com/miakhalifa/status/13568277...,2021-02-03T04:51:48+00:00,What in the human rights violations is going o...,What in the human rights violations is going o...,1356827705161879553,"{'username': 'miakhalifa', 'displayname': 'Mia...",[],[],9082,26972,...,4606,1356827705161879553,en,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,[{'previewUrl': 'https://pbs.twimg.com/media/E...,,,
314192,https://twitter.com/TeamJuJu/status/1357048037...,2021-02-03T19:27:19+00:00,"Happy to share that I’ve donated $10,000 to pr...","Happy to share that I’ve donated $10,000 to pr...",1357048037302960129,"{'username': 'TeamJuJu', 'displayname': 'JuJu ...",[https://www.usnews.com/news/world/articles/20...,[https://t.co/0WoEw0l3ij],7683,23251,...,4082,1357048037302960129,en,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,,,,
215034,https://twitter.com/BobBlackman/status/1357755...,2021-02-05T18:19:19+00:00,There has been much social media coverage arou...,There has been much social media coverage arou...,1357755699162398720,"{'username': 'BobBlackman', 'displayname': 'Bo...",[],[],1845,20132,...,1592,1357755699162398720,en,"<a href=""https://mobile.twitter.com"" rel=""nofo...",https://mobile.twitter.com,Twitter Web App,[{'previewUrl': 'https://pbs.twimg.com/media/E...,,,
398011,https://twitter.com/vanessa_vash/status/135668...,2021-02-02T19:09:23+00:00,Farmers feed the world. Fight for them. Protec...,Farmers feed the world. Fight for them. Protec...,1356681136655769605,"{'username': 'vanessa_vash', 'displayname': 'V...",[],[],1301,18744,...,820,1356681136655769605,en,"<a href=""http://twitter.com/download/android"" ...",http://twitter.com/download/android,Twitter for Android,,,,
325261,https://twitter.com/kylekuzma/status/135700972...,2021-02-03T16:55:04+00:00,Should be talking about this! #FarmersProtest\...,Should be talking about this! #FarmersProtest\...,1357009721090138112,"{'username': 'kylekuzma', 'displayname': 'kuz'...",[https://www.cnn.com/2021/02/01/asia/india-int...,[https://t.co/Xh09iTvVoF],4167,17368,...,2505,1357009721090138112,en,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,,,,
163689,https://twitter.com/AmandaCerny/status/1359013...,2021-02-09T05:36:49+00:00,To all of my influencer/celeb friends- read up...,To all of my influencer/celeb friends- read up...,1359013362881994752,"{'username': 'AmandaCerny', 'displayname': 'Am...",[],[],2028,15677,...,813,1359013362881994752,en,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,,,,


In [27]:
main("user")

The top 10 users who posted the most tweets are:
harjot_tweeting: 7134 tweets
tasveersandhu: 2091 tweets
shells_n_petals: 1991 tweets
jot__b: 1841 tweets
rebelpacifist: 1806 tweets
rumsomal: 1722 tweets
Iamjazzie96: 1502 tweets
Jass_k_G: 1460 tweets
DigitalKisanBot: 1453 tweets
z_khalique007: 1446 tweets


In [28]:
main("date")

The top 10 days with the most tweets are:
2021-02-03: 83866 tweets
2021-02-04: 58607 tweets
2021-02-05: 33332 tweets
2021-02-02: 28548 tweets
2021-02-06: 22420 tweets
2021-02-07: 11325 tweets
2021-02-09: 9320 tweets
2021-02-08: 8920 tweets
2021-02-10: 7973 tweets
2021-02-11: 5698 tweets


In [29]:
main("hashtag")

The top 10 most used hashtags are:
#FARMERSPROTEST: 425951 times
#ISTANDWITHFARMERS: 17086 times
#INDIANFARMERSHUMANRIGHTS: 12176 times
#STANDWITHFARMERS: 11468 times
#FARMERSAREINDIA: 11139 times
#RIHANNA: 9490 times
#FARMERSPROTESTS: 8910 times
#FARMERS: 8704 times
#INDIA: 6629 times
#SHAMEONBOLLYWOOD: 6445 times


In [30]:
main("None of the above")

'You must specify the function: tweet, user, date or hashtag.'