In this file you can write whatever you consider appropriate. We recommend detailing your solution and all the assumptions you are making. Here you can run the functions you defined in the other files of the src folder and measure time, memory, etc.
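
As a starting point for the time and memory measurements mentioned above, a helper along these lines could wrap any of the functions. This is a minimal sketch using only the standard library; the name `measure` is hypothetical and not part of the challenge. Note that `tracemalloc` only tracks memory allocated by the local Python process, so work done remotely (for example inside BigQuery) is not counted.

```python
import time
import tracemalloc
from typing import Any, Callable, Tuple


def measure(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float, int]:
    """Run func and return (result, elapsed_seconds, peak_bytes).

    Hypothetical helper: peak_bytes is the peak of Python-level allocations
    traced by tracemalloc while func was running.
    """
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        # Always stop tracing, even if func raises.
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak
```

For example, `measure(q1_time, file_path)` would return the query result together with the wall-clock time and the peak traced memory.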

```python
file_path = "farmers-protest-tweets-2021-2-4.json"
```

# QUESTION 1

1. The top 10 dates with the most tweets. For each of those days, report the user (username) with the most posts. It must include the following functions:
```python
def q1_time(file_path: str) -> List[Tuple[datetime.date, str]]:
```
```python
def q1_memory(file_path: str) -> List[Tuple[datetime.date, str]]:
```
```python
Returns: 
[(datetime.date(1999, 11, 15), "LATAM321"), (datetime.date(1999, 7, 15), "LATAM_CHI"), ...]
```
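
To make the expected result concrete, the following is a minimal local sketch of the required logic using only the standard library. It is not the BigQuery-based approach described in this notebook; the name `q1_sketch` is hypothetical, and it assumes each line of the file is a JSON object with an ISO 8601 `date` string (starting with `YYYY-MM-DD`) and a nested `user.username` field.

```python
import json
from collections import Counter, defaultdict
from datetime import date
from typing import Dict, List, Tuple


def q1_sketch(file_path: str) -> List[Tuple[date, str]]:
    """Return the 10 busiest dates, each with its most active username."""
    tweets_per_day: Counter = Counter()
    users_per_day: Dict[date, Counter] = defaultdict(Counter)

    with open(file_path, encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # skip blank lines
            tweet = json.loads(line)
            # Assumption: the first 10 characters of "date" are YYYY-MM-DD.
            day = date.fromisoformat(tweet["date"][:10])
            username = tweet["user"]["username"]
            tweets_per_day[day] += 1
            users_per_day[day][username] += 1

    # For each of the top 10 dates, pick the user with the most tweets that day.
    return [
        (day, users_per_day[day].most_common(1)[0][0])
        for day, _ in tweets_per_day.most_common(10)
    ]
```

Reading the file line by line keeps only the per-day counters in memory, rather than the full dataset.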
## QUESTION 1 - TIME (q1_time)
This solution is built on Google Cloud Platform (GCP). The function contains the ETL process that moves the data from the local JSON file to Google Cloud Storage and then models it in a BigQuery table that can be queried quickly and efficiently.
### ETL
The extract, transform, and load (ETL) process moves the files into GCP, where they can be queried quickly and efficiently.
#### Credentials
This development uses Google Cloud, so a GCP project named "project-latam-challenge" is created. Within this project, a service account named "sa-etl-latam-challenge" is created and used to load the data.
```python
import os

# Path to the Google Cloud JSON credentials file
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "../credentials/project-latam-challenge-749ce1a96052.json"
```
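
Before calling any Google Cloud client, it can help to fail fast if the credentials path is wrong. The following is a small sketch using only the standard library; the helper name `credentials_configured` is hypothetical and only checks that the environment variable points to an existing file, not that the key itself is valid.

```python
import os
from pathlib import Path


def credentials_configured(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> bool:
    """Return True if env_var is set and points to an existing file.

    Hypothetical helper: it does not validate the contents of the key file.
    """
    path = os.environ.get(env_var)
    return bool(path) and Path(path).is_file()
```

A notebook cell could then raise an error early when `credentials_configured()` returns `False`, instead of failing later inside a client call.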
#### Loading into a DataFrame
The JSON [file](https://drive.google.com/file/d/1ig2ngoXFTxP5Pa8muXo02mDTFexZzsis/view?usp=sharing) provided as part of the challenge is loaded into a DataFrame.
```python
import pandas as pd

# Read the JSON Lines file (one JSON object per line) into a DataFrame
df = pd.read_json(file_path, lines=True)

# Show the first rows of the DataFrame
df.head()
```
#### Loading the DataFrame into GCP
##### Uploading the file to Google Cloud Storage (GCS)
Using the Google Cloud SDK, a bucket named 'bucket-project-latam-challenge' is created in the 'project-latam-challenge' project and the file is uploaded directly to that bucket.
```bash
gcloud auth login
gcloud config set project project-latam-challenge

gsutil mb gs://bucket-project-latam-challenge/

gsutil cp farmers-protest-tweets-2021-2-4.json gs://bucket-project-latam-challenge/
```
##### Load configuration
```python
from google.cloud import bigquery

# Parameters
project_id = 'project-latam-challenge'
dataset_id = 'twitter_data'
table_id = 'farmers_protest_tweets_2021'
```
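
With these parameters, a BigQuery table is addressed by its fully qualified ID in the form `project.dataset.table`, and the uploaded file by a `gs://bucket/object` URI. The sketch below shows how both strings could be built; the helper names `table_ref` and `gcs_uri` are hypothetical, and the values are repeated from the cells above so the block is self-contained.

```python
def table_ref(project_id: str, dataset_id: str, table_id: str) -> str:
    """Build a fully qualified BigQuery table ID (project.dataset.table)."""
    return f"{project_id}.{dataset_id}.{table_id}"


def gcs_uri(bucket_name: str, object_name: str) -> str:
    """Build the gs:// URI of an object stored in a GCS bucket."""
    return f"gs://{bucket_name}/{object_name}"


# Values taken from the configuration and upload steps above.
full_table_id = table_ref("project-latam-challenge", "twitter_data", "farmers_protest_tweets_2021")
source_uri = gcs_uri("bucket-project-latam-challenge", "farmers-protest-tweets-2021-2-4.json")
```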
##### Schema declaration
The schema structure is declared explicitly, with the appropriate type for each field, which avoids data type errors when loading the data into BigQuery.
```python
schema = [
    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the tweet"),
    bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the tweet"),
    bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Text content of the tweet"),
    bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered text content of the tweet"),
    bigquery.SchemaField(name="id", field_type="INTEGER", mode="REQUIRED", description="Unique identifier for the Tweet"),
    bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="The user who posted this Tweet", fields=[
        bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user"),
        bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user"),
        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="Unique identifier for the user"),
        bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="User's description"),
        bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user"),
        bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
            bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
            bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
            bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
        ]),
        bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user is verified"),
        bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date"),
        bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers"),
        bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends"),
        bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses"),
        bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites"),
        bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times listed"),
        bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded"),
        bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user"),
        bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user's tweets are protected"),
        bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="User's link URL"),
        bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="User's link t.co URL"),
        bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL"),
        bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL"),
        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="User's URL")
    ]),
    bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the tweet"),
    bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the tweet"),
    bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies"),
    bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets"),
    bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes"),
    bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes"),
    bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID"),
    bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the tweet"),
    bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the tweet"),
    bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL"),
    bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label"),
    bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the tweet", fields=[
        bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
        bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
        bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
        bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
        bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
        bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
            bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
            bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
        ])
    ]),
    bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet", fields=[
        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet"),
        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet"),
        bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet", fields=[
            bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="Unique identifier for the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
            ]),
            bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the retweeted tweet is verified"),
            bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times listed of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the retweeted tweet are protected"),
            bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="User's link t.co URL"),
            bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL"),
            bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="User's URL")
        ]),
        bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet"),
        bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet"),
    ]),
    bigquery.SchemaField(name="quotedTweet", field_type="RECORD", mode="NULLABLE", description="Quoted tweet", fields=[
        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the quoted tweet"),
        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the quoted tweet"),
        bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the quoted tweet", fields=[
            bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
            ]),
            bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the quoted tweet is verified"),
            bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times listed of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the quoted tweet are protected"),
            bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="User's link URL of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="User's link t.co URL of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the user who posted the quoted tweet")
        ]),
        bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the quoted tweet"),
        bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the quoted tweet"),
        bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered content of the quoted tweet"),
        bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID of the quoted tweet"),
        bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the quoted tweet"),
        bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes of the quoted tweet"),
        bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the quoted tweet", fields=[
            bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
            bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
            bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
            bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
            bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
            bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
                bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
                bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
            ])
        ]),
        bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the quoted tweet"),
        bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes of the quoted tweet"),
        bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies of the quoted tweet"),
        bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets of the quoted tweet"),
        bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the quoted tweet"),
        bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label of the quoted tweet"),
        bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL of the quoted tweet"),
        bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the quoted tweet"),
        bigquery.SchemaField(name="mentionedUsers", field_type="RECORD", mode="REPEATED", description="Users mentioned in the quoted tweet", fields=[
            bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the mentioned user"),
            bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the mentioned user"),
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the mentioned user"),
            bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the mentioned user"),
            bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the mentioned user"),
            bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the description of the mentioned user", fields=[
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
            ]),
            bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user is verified"),
            bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the mentioned user"),
            bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the mentioned user"),
            bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the mentioned user"),
            bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the mentioned user"),
            bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the mentioned user"),
            bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the mentioned user is listed"),
            bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the mentioned user"),
            bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the mentioned user"),
            bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user's tweets are protected"),
            bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the mentioned user"),
            bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the mentioned user"),
            bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the mentioned user"),
            bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the mentioned user"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the mentioned user")
        ]),
        bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet within the quoted tweet", fields=[
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet within the quoted tweet"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet within the quoted tweet"),
            bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet within the quoted tweet", fields=[
                bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet within the quoted tweet"),
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the retweeted tweet within the quoted tweet"),
                bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet within the quoted tweet"),
            ]),
            bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet within the quoted tweet"),
            bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet within the quoted tweet"),
        ]),
        bigquery.SchemaField(name="quotedTweet", field_type="RECORD", mode="NULLABLE", description="Quoted tweet within the quoted tweet", fields=[
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the quoted tweet within the quoted tweet", fields=[
                bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                    bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                    bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                    bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                ]),
                bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the quoted tweet within the quoted tweet is verified"),
                bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times listed of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the quoted tweet within the quoted tweet are protected"),
                bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="User's link URL of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="User's link t.co URL of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the user who posted the quoted tweet within the quoted tweet")
            ]),
            bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered content of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the quoted tweet within the quoted tweet", fields=[
                bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
                bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
                bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
                bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
                bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
                bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
                    bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
                    bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
                ])
            ]),
            bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="mentionedUsers", field_type="RECORD", mode="REPEATED", description="Users mentioned in the quoted tweet", fields=[
                bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the mentioned user"),
                bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the mentioned user"),
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the mentioned user"),
                bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the mentioned user"),
                bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the mentioned user"),
                bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the description of the mentioned user", fields=[
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                    bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                    bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                    bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                ]),
                bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user is verified"),
                bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the mentioned user"),
                bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the mentioned user"),
                bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the mentioned user"),
                bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the mentioned user"),
                bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the mentioned user"),
                bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the mentioned user is listed"),
                bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the mentioned user"),
                bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the mentioned user"),
                bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user's tweets are protected"),
                bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the mentioned user"),
                bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the mentioned user"),
                bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the mentioned user"),
                bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the mentioned user"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the mentioned user")
            ]),
            bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet within the quoted tweet within the quoted tweet", fields=[
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet within the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet within the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet within the quoted tweet within the quoted tweet", fields=[
                    bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                ]),
                bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet within the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet within the quoted tweet within the quoted tweet"),
            ]),
            bigquery.SchemaField(name="quotedTweet", field_type="RECORD", mode="NULLABLE", description="Quoted tweet", fields=[
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the quoted tweet"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the quoted tweet"),
                bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the quoted tweet", fields=[
                    bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                        bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                        bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                        bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                    ]),
                    bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the quoted tweet is verified"),
                    bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the user who posted the quoted tweet is listed"),
                    bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the quoted tweet are protected"),
                    bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the user who posted the quoted tweet")
                ]),
                bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the quoted tweet"),
                bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the quoted tweet"),
                bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered content of the quoted tweet"),
                bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID of the quoted tweet"),
                bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the quoted tweet"),
                bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes of the quoted tweet"),
                bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the quoted tweet", fields=[
                    bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
                    bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
                    bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
                    bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
                    bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
                    bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
                        bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
                        bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
                    ])
                ]),
                bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the quoted tweet"),
                bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes of the quoted tweet"),
                bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies of the quoted tweet"),
                bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets of the quoted tweet"),
                bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the quoted tweet"),
                bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label of the quoted tweet"),
                bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL of the quoted tweet"),
                bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the quoted tweet"),
                bigquery.SchemaField(name="mentionedUsers", field_type="RECORD", mode="REPEATED", description="Users mentioned in the quoted tweet", fields=[
                    bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the mentioned user"),
                    bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the mentioned user"),
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the mentioned user"),
                    bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the mentioned user"),
                    bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the mentioned user"),
                    bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the description of the mentioned user", fields=[
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                        bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                        bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                        bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                    ]),
                    bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user is verified"),
                    bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the mentioned user"),
                    bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the mentioned user"),
                    bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the mentioned user"),
                    bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the mentioned user"),
                    bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the mentioned user"),
                    bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the mentioned user is listed"),
                    bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the mentioned user"),
                    bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the mentioned user"),
                    bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user's tweets are protected"),
                    bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the mentioned user"),
                    bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the mentioned user"),
                    bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the mentioned user"),
                    bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the mentioned user"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the mentioned user")
                ]),
                bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet within the quoted tweet", fields=[
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet within the quoted tweet", fields=[
                        bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the retweeted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet within the quoted tweet"),
                    ]),
                    bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet within the quoted tweet"),
                ]),
                bigquery.SchemaField(name="quotedTweet", field_type="RECORD", mode="NULLABLE", description="Quoted tweet within the quoted tweet", fields=[
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the quoted tweet within the quoted tweet", fields=[
                        bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                            bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                            bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                            bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                        ]),
                        bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the quoted tweet within the quoted tweet is verified"),
                        bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the user who posted the quoted tweet within the quoted tweet is listed"),
                        bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the quoted tweet within the quoted tweet are protected"),
                        bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the user who posted the quoted tweet within the quoted tweet")
                    ]),
                    bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered content of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the quoted tweet within the quoted tweet", fields=[
                        bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
                        bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
                        bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
                        bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
                        bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
                        bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
                            bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
                            bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
                            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
                        ])
                    ]),
                    bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet within the quoted tweet within the quoted tweet", fields=[
                        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet within the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet within the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet within the quoted tweet within the quoted tweet", fields=[
                            bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                            bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                        ]),
                        bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet within the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet within the quoted tweet within the quoted tweet"),
                    ])
                ])
            ]),
        ])
    ]),
    bigquery.SchemaField(name="mentionedUsers", field_type="RECORD", mode="REPEATED", description="Users mentioned in the tweet", fields=[
        bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the mentioned user"),
        bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the mentioned user"),
        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the mentioned user"),
        bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the mentioned user"),
        bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the mentioned user"),
        bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the description of the mentioned user", fields=[
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
            bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
            bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
            bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
        ]),
        bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user is verified"),
        bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the mentioned user"),
        bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the mentioned user"),
        bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the mentioned user"),
        bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the mentioned user"),
        bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the mentioned user"),
        bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the mentioned user is listed"),
        bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the mentioned user"),
        bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the mentioned user"),
        bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user's tweets are protected"),
        bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the mentioned user"),
        bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the mentioned user"),
        bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the mentioned user"),
        bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the mentioned user"),
        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the mentioned user")
    ])
]
```
##### BigQuery table creation process
```python
def load_data_from_gcs_to_bigquery(uri, table_id):
    # Initialize the BigQuery client
    client = bigquery.Client()

    job_config = bigquery.LoadJobConfig(
        schema=schema,
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        max_bad_records=0,  # Fail the job on the first bad record
        time_partitioning=bigquery.TimePartitioning(
            type_=bigquery.TimePartitioningType.DAY,
            field="date"  # Partitioning column
        )
    )

    load_job = client.load_table_from_uri(
        uri, table_id, job_config=job_config
    )

    print(f'Starting job {load_job.job_id}')
    load_job.result()  # Wait for the load job to finish
    print('Job finished.')

    destination_table = client.get_table(table_id)
    print(f'Loaded {destination_table.num_rows} rows.')

if __name__ == '__main__':
    uri = 'gs://bucket-project-latam-challenge/farmers-protest-tweets-2021-2-4.json'
    table_ref = f"{project_id}.{dataset_id}.{table_id}"

    load_data_from_gcs_to_bigquery(uri, table_ref)
```
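
Because the load job sets `max_bad_records=0`, a single malformed line is enough to make the whole job fail. A minimal pre-flight check, sketched below, could be run on the local file before uploading it. The helper name `find_bad_records` and the choice of `id` as the only required key are assumptions for illustration (the schema marks `id` as `REQUIRED`); this check is not part of the original solution and does not replicate every validation BigQuery performs.

```python
import json
from typing import Iterable, List, Tuple


def find_bad_records(lines: Iterable[str], required_keys=("id",)) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for each line that fails this basic check."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are ignored by this check
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((line_number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((line_number, "not a JSON object"))
            continue
        for key in required_keys:
            # A key that is absent or null cannot satisfy a REQUIRED column.
            if record.get(key) is None:
                problems.append((line_number, f"missing required field '{key}'"))
    return problems
```

With a real file, the open file handle can be passed directly, for example `find_bad_records(open(file_path, encoding="utf-8"))`; an empty list means no problems were found by this check.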
##### Analysis
```python
from google.cloud import bigquery
from typing import List, Tuple
import datetime

def get_bigquery_client():
    return bigquery.Client()

def run_bigquery_query(query: str):
    client = get_bigquery_client()
    query_job = client.query(query)
    results = query_job.result()
    return results

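def q1_time(file_path: str) -> List[Tuple[datetime.date, str]]:
    # file_path is kept only to match the required signature; the query reads
    # the BigQuery table populated by the ETL step.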
    query = """
    WITH TEMP_DATA_001 AS 
    (
      SELECT DATE(date) AS fecha,
        id,
        user.username AS username
      FROM `project-latam-challenge.twitter_data.farmers_protest_tweets_2021` 
      WHERE DATE(date) IS NOT NULL
    ), TEMP_DATE_001 AS 
    (
      SELECT fecha,
        COUNT(DISTINCT id) AS tweet_qty
      FROM TEMP_DATA_001
      GROUP BY fecha
    ), TEMP_DATE_002 AS 
    (
      SELECT A.*,
        -- ROW_NUMBER with a deterministic tie-break keeps the result reproducible
        ROW_NUMBER() OVER(ORDER BY tweet_qty DESC, fecha) AS ranking_tweet
      FROM TEMP_DATE_001 A
    ), TEMP_USER_001 AS 
    (
      SELECT fecha,
        username,
        COUNT(DISTINCT id) AS tweet_qty
      FROM TEMP_DATA_001
      GROUP BY fecha, username
    ), TEMP_USER_002 AS 
    (
      SELECT A.*,
        -- ROW_NUMBER guarantees exactly one top user per date, even on ties
        ROW_NUMBER() OVER(PARTITION BY fecha ORDER BY tweet_qty DESC, username) AS ranking_user
      FROM TEMP_USER_001 A
    ), TEMP_USER_003 AS 
    (
      SELECT A.*
      FROM TEMP_USER_002 A
      WHERE ranking_user=1
    )
    SELECT A.fecha,
      B.username
    FROM TEMP_DATE_002 A
    LEFT JOIN TEMP_USER_003 B
    ON A.fecha=B.fecha 
    WHERE A.ranking_tweet<=10
    ORDER BY ranking_tweet
    """
    
    results = run_bigquery_query(query)
    return [(row.fecha, row.username) for row in results]

if __name__ == "__main__":
    file_path = '../data/farmers_protest_tweets_2021.json'
    top_10_dates = q1_time(file_path)
    for date, user in top_10_dates:
        print(f"Date: {date}, User: {user}")
```
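
To sanity-check the query on a small sample without BigQuery, the same logic (count tweets per day, keep the ten busiest days, then pick the most active user on each) can be sketched in plain Python. This is an illustrative cross-check, not the implementation used above; the function name `top_dates_with_top_user` and both tie-break rules are assumptions.

```python
import datetime
from collections import Counter
from typing import Iterable, List, Tuple


def top_dates_with_top_user(
    tweets: Iterable[Tuple[datetime.date, str]], n: int = 10
) -> List[Tuple[datetime.date, str]]:
    """For the n dates with the most tweets, return (date, most active username).

    Assumes every (date, username) pair represents one distinct tweet.
    """
    tweets_per_date = Counter()
    tweets_per_date_user = Counter()
    for date, username in tweets:
        tweets_per_date[date] += 1
        tweets_per_date_user[(date, username)] += 1

    # Busiest dates first; ties broken by the earliest date (an assumption).
    top_dates = sorted(tweets_per_date, key=lambda d: (-tweets_per_date[d], d))[:n]

    result = []
    for date in top_dates:
        candidates = [
            (count, user)
            for (d, user), count in tweets_per_date_user.items()
            if d == date
        ]
        # Highest count first; ties broken alphabetically (an assumption).
        _, best_user = min(candidates, key=lambda c: (-c[0], c[1]))
        result.append((date, best_user))
    return result
```

Feeding it a handful of hand-counted rows and comparing against `q1_time` on the same rows is one way to catch mistakes in the SQL.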

# ETL

An extract, transform, and load (ETL) process is built for the file so that it can be processed directly in BigQuery, part of Google Cloud Platform (GCP). This speeds up the computation and lets the analysis be written in SQL.

### Credentials

This solution runs on Google Cloud, so a GCP project named "project-latam-challenge" is created. Within that project, a service account named "sa-etl-latam-challenge" is created and used to load the data.

In [1]:
import os

# Path to the Google Cloud JSON credentials file
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "../credentials/project-latam-challenge-749ce1a96052.json"
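
A missing or mistyped key path typically only surfaces later, as an authentication error when the first client is created. A small guard like the one below, run right after setting the variable, fails earlier with a clearer message. The helper name `require_credentials_file` is hypothetical; only the `GOOGLE_APPLICATION_CREDENTIALS` variable comes from the setup above.

```python
import os


def require_credentials_file(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> str:
    """Return the credentials path from the environment, or raise if it is unusable."""
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{env_var} points to a missing file: {path}")
    return path
```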


### Loading into a DataFrame

The JSON [file](https://drive.google.com/file/d/1ig2ngoXFTxP5Pa8muXo02mDTFexZzsis/view?usp=sharing) provided as part of the challenge is loaded into a DataFrame.

In [3]:
import pandas as pd

# Read the newline-delimited JSON file into a DataFrame
df = pd.read_json(file_path, lines=True)

# Show the first rows of the DataFrame
df.head()

Unnamed: 0,url,date,content,renderedContent,id,user,outlinks,tcooutlinks,replyCount,retweetCount,...,quoteCount,conversationId,lang,source,sourceUrl,sourceLabel,media,retweetedTweet,quotedTweet,mentionedUsers
0,https://twitter.com/ArjunSinghPanam/status/136...,2021-02-24 09:23:35+00:00,The world progresses while the Indian police a...,The world progresses while the Indian police a...,1364506249291784198,"{'username': 'ArjunSinghPanam', 'displayname':...",[https://twitter.com/ravisinghka/status/136415...,[https://t.co/es3kn0IQAF],0,0,...,0,1364506249291784198,en,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,,,{'url': 'https://twitter.com/RaviSinghKA/statu...,"[{'username': 'narendramodi', 'displayname': '..."
1,https://twitter.com/PrdeepNain/status/13645062...,2021-02-24 09:23:32+00:00,#FarmersProtest \n#ModiIgnoringFarmersDeaths \...,#FarmersProtest \n#ModiIgnoringFarmersDeaths \...,1364506237451313155,"{'username': 'PrdeepNain', 'displayname': 'Pra...",[],[],0,0,...,0,1364506237451313155,en,"<a href=""http://twitter.com/download/android"" ...",http://twitter.com/download/android,Twitter for Android,[{'thumbnailUrl': 'https://pbs.twimg.com/ext_t...,,,"[{'username': 'Kisanektamorcha', 'displayname'..."
2,https://twitter.com/parmarmaninder/status/1364...,2021-02-24 09:23:22+00:00,ਪੈਟਰੋਲ ਦੀਆਂ ਕੀਮਤਾਂ ਨੂੰ ਮੱਦੇਨਜ਼ਰ ਰੱਖਦੇ ਹੋਏ \nਮੇ...,ਪੈਟਰੋਲ ਦੀਆਂ ਕੀਮਤਾਂ ਨੂੰ ਮੱਦੇਨਜ਼ਰ ਰੱਖਦੇ ਹੋਏ \nਮੇ...,1364506195453767680,"{'username': 'parmarmaninder', 'displayname': ...",[],[],0,0,...,0,1364506195453767680,pa,"<a href=""http://twitter.com/download/android"" ...",http://twitter.com/download/android,Twitter for Android,,,,
3,https://twitter.com/anmoldhaliwal/status/13645...,2021-02-24 09:23:16+00:00,@ReallySwara @rohini_sgh watch full video here...,@ReallySwara @rohini_sgh watch full video here...,1364506167226032128,"{'username': 'anmoldhaliwal', 'displayname': '...",[https://youtu.be/-bUKumwq-J8],[https://t.co/wBPNdJdB0n],0,0,...,0,1364350947099484160,en,"<a href=""https://mobile.twitter.com"" rel=""nofo...",https://mobile.twitter.com,Twitter Web App,[{'thumbnailUrl': 'https://pbs.twimg.com/ext_t...,,,"[{'username': 'ReallySwara', 'displayname': 'S..."
4,https://twitter.com/KotiaPreet/status/13645061...,2021-02-24 09:23:10+00:00,#KisanEktaMorcha #FarmersProtest #NoFarmersNoF...,#KisanEktaMorcha #FarmersProtest #NoFarmersNoF...,1364506144002088963,"{'username': 'KotiaPreet', 'displayname': 'Pre...",[],[],0,0,...,0,1364506144002088963,und,"<a href=""http://twitter.com/download/iphone"" r...",http://twitter.com/download/iphone,Twitter for iPhone,[{'previewUrl': 'https://pbs.twimg.com/media/E...,,,
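
`pd.read_json` loads every column into memory, including the large nested `user`, `media`, and `quotedTweet` objects, although question 1 only needs the tweet date and the username. As a rough, memory-friendlier alternative, the file can be streamed line by line with the standard library, keeping only those two values. The generator name `iter_date_and_username` is an assumption for illustration, and the sketch assumes that `date` is stored as an ISO 8601 string whose first ten characters are `YYYY-MM-DD`.

```python
import datetime
import json
from typing import Iterable, Iterator, Tuple


def iter_date_and_username(lines: Iterable[str]) -> Iterator[Tuple[datetime.date, str]]:
    """Yield (tweet date, username) from newline-delimited JSON, one record at a time."""
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)
        user = record.get("user") or {}
        username = user.get("username")
        date_text = record.get("date")
        if username is None or date_text is None:
            continue  # skip records without the fields we need
        # Assumes an ISO 8601 timestamp that starts with YYYY-MM-DD.
        yield datetime.date.fromisoformat(date_text[:10]), username
```

With the real file, an open handle can be passed in, for example `with open(file_path, encoding="utf-8") as f: pairs = list(iter_date_and_username(f))`.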


## Loading the data into GCP

### Load configuration

Using the Google Cloud SDK, a bucket named 'bucket-project-latam-challenge' is created in the 'project-latam-challenge' project, and the file is uploaded directly to that bucket.
gcloud auth login
gcloud config set project project-latam-challenge

gsutil mb gs://bucket-project-latam-challenge/

gsutil cp farmers-protest-tweets-2021-2-4.json gs://bucket-project-latam-challenge/
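
The same upload could also be scripted from Python with the `google-cloud-storage` client instead of `gsutil`. The sketch below assumes that package is installed, that the bucket already exists, and that the credentials configured earlier grant write access to it; the helper names `gcs_uri` and `upload_to_gcs` are hypothetical.

```python
def gcs_uri(bucket_name: str, blob_name: str) -> str:
    """Build the gs:// URI that a BigQuery load job expects."""
    return f"gs://{bucket_name}/{blob_name}"


def upload_to_gcs(local_path: str, bucket_name: str, blob_name: str) -> str:
    """Upload a local file to an existing bucket and return its gs:// URI."""
    # Imported here so the helpers can be loaded without the package installed.
    from google.cloud import storage

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_filename(local_path)
    return gcs_uri(bucket_name, blob_name)
```

The returned URI can then be passed as the `uri` argument of the load step.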

In [4]:
from google.cloud import bigquery

# Parameters
project_id = 'project-latam-challenge'
dataset_id = 'twitter_data'
table_id = 'farmers_protest_tweets_2021'
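
A load job can create the destination table, but the dataset itself has to exist beforehand. One possible way to ensure that from Python is sketched below. The helper names `full_table_id` and `ensure_dataset` and the default `US` location are assumptions, not part of the original notebook.

```python
def full_table_id(project_id: str, dataset_id: str, table_id: str) -> str:
    """Build the fully qualified `project.dataset.table` identifier."""
    return f"{project_id}.{dataset_id}.{table_id}"


def ensure_dataset(project_id: str, dataset_id: str, location: str = "US"):
    """Create the dataset if it does not exist yet and return it."""
    # Imported here so the helpers can be loaded without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    dataset = bigquery.Dataset(f"{project_id}.{dataset_id}")
    dataset.location = location  # assumed location; adjust to the bucket's region
    # exists_ok=True makes the call a no-op when the dataset is already there.
    return client.create_dataset(dataset, exists_ok=True)
```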


The schema is declared explicitly, with the type of each field, to avoid data-type errors when loading the data into BigQuery.

In [5]:
schema = [
    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the tweet"),
    bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the tweet"),
    bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Text content of the tweet"),
    bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered text content of the tweet"),
    bigquery.SchemaField(name="id", field_type="INTEGER", mode="REQUIRED", description="Unique identifier for the Tweet"),
    bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="The user who posted this Tweet", fields=[
        bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user"),
        bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user"),
        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="Unique identifier for the user"),
        bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="User's description"),
        bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user"),
        bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
            bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
            bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
            bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
        ]),
        bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user is verified"),
        bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date"),
        bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers"),
        bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends"),
        bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses"),
        bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites"),
        bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times listed"),
        bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded"),
        bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user"),
        bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user's tweets are protected"),
        bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="User's link URL"),
        bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="User's link t.co URL"),
        bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL"),
        bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL"),
        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="User's URL")
    ]),
    bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the tweet"),
    bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the tweet"),
    bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies"),
    bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets"),
    bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes"),
    bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes"),
    bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID"),
    bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the tweet"),
    bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the tweet"),
    bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL"),
    bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label"),
    bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the tweet", fields=[
        bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
        bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
        bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
        bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
        bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
        bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
            bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
            bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
        ])
    ]),
    bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet", fields=[
        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet"),
        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet"),
        bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet", fields=[
            bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="Unique identifier for the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
            ]),
            bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the retweeted tweet is verified"),
            bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times listed of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the retweeted tweet are protected"),
            bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the user who posted the retweeted tweet"),
            bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="User's link t.co URL"),
            bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL"),
            bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="User's URL")
        ]),
        bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet"),
        bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet"),
    ]),
    bigquery.SchemaField(name="quotedTweet", field_type="RECORD", mode="NULLABLE", description="Quoted tweet", fields=[
        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the quoted tweet"),
        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the quoted tweet"),
        bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the quoted tweet", fields=[
            bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
            ]),
            bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the quoted tweet is verified"),
            bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times listed of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the quoted tweet are protected"),
            bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the user who posted the quoted tweet"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the user who posted the quoted tweet")
        ]),
        bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the quoted tweet"),
        bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the quoted tweet"),
        bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered content of the quoted tweet"),
        bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID of the quoted tweet"),
        bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the quoted tweet"),
        bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes of the quoted tweet"),
        bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the quoted tweet", fields=[
            bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
            bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
            bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
            bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
            bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
            bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
                bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
                bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
            ])
        ]),
        bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the quoted tweet"),
        bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes of the quoted tweet"),
        bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies of the quoted tweet"),
        bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets of the quoted tweet"),
        bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the quoted tweet"),
        bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label of the quoted tweet"),
        bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL of the quoted tweet"),
        bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the quoted tweet"),
        bigquery.SchemaField(name="mentionedUsers", field_type="RECORD", mode="REPEATED", description="Users mentioned in the quoted tweet", fields=[
            bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the mentioned user"),
            bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the mentioned user"),
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the mentioned user"),
            bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the mentioned user"),
            bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the mentioned user"),
            bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the description of the mentioned user", fields=[
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
            ]),
            bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user is verified"),
            bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the mentioned user"),
            bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the mentioned user"),
            bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the mentioned user"),
            bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the mentioned user"),
            bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the mentioned user"),
            bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the mentioned user is listed"),
            bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the mentioned user"),
            bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the mentioned user"),
            bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user's tweets are protected"),
            bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the mentioned user"),
            bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the mentioned user"),
            bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the mentioned user"),
            bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the mentioned user"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the mentioned user")
        ]),
        bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet within the quoted tweet", fields=[
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet within the quoted tweet"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet within the quoted tweet"),
            bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet within the quoted tweet", fields=[
                bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet within the quoted tweet"),
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the retweeted tweet within the quoted tweet"),
                bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet within the quoted tweet"),
            ]),
            bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet within the quoted tweet"),
            bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet within the quoted tweet"),
        ]),
        bigquery.SchemaField(name="quotedTweet", field_type="RECORD", mode="NULLABLE", description="Quoted tweet within the quoted tweet", fields=[
            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the quoted tweet within the quoted tweet", fields=[
                bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                    bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                    bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                    bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                ]),
                bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the quoted tweet within the quoted tweet is verified"),
                bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the user who posted the quoted tweet within the quoted tweet is listed"),
                bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the quoted tweet within the quoted tweet are protected"),
                bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the user who posted the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the user who posted the quoted tweet within the quoted tweet")
            ]),
            bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered content of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the quoted tweet within the quoted tweet", fields=[
                bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
                bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
                bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
                bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
                bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
                bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
                    bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
                    bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
                ])
            ]),
            bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL of the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the quoted tweet within the quoted tweet"),
            bigquery.SchemaField(name="mentionedUsers", field_type="RECORD", mode="REPEATED", description="Users mentioned in the quoted tweet within the quoted tweet", fields=[
                bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the mentioned user"),
                bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the mentioned user"),
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the mentioned user"),
                bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the mentioned user"),
                bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the mentioned user"),
                bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the description of the mentioned user", fields=[
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                    bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                    bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                    bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                ]),
                bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user is verified"),
                bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the mentioned user"),
                bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the mentioned user"),
                bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the mentioned user"),
                bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the mentioned user"),
                bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the mentioned user"),
                bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the mentioned user is listed"),
                bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the mentioned user"),
                bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the mentioned user"),
                bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user's tweets are protected"),
                bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the mentioned user"),
                bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the mentioned user"),
                bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the mentioned user"),
                bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the mentioned user"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the mentioned user")
            ]),
            bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet within the quoted tweet within the quoted tweet", fields=[
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet within the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet within the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet within the quoted tweet within the quoted tweet", fields=[
                    bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                ]),
                bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet within the quoted tweet within the quoted tweet"),
                bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet within the quoted tweet within the quoted tweet"),
            ]),
            bigquery.SchemaField(name="quotedTweet", field_type="RECORD", mode="NULLABLE", description="Quoted tweet", fields=[
                bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the quoted tweet"),
                bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the quoted tweet"),
                bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the quoted tweet", fields=[
                    bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                        bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                        bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                        bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                    ]),
                    bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the quoted tweet is verified"),
                    bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the user who posted the quoted tweet is listed"),
                    bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the quoted tweet are protected"),
                    bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the user who posted the quoted tweet"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the user who posted the quoted tweet")
                ]),
                bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the quoted tweet"),
                bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the quoted tweet"),
                bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered content of the quoted tweet"),
                bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID of the quoted tweet"),
                bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the quoted tweet"),
                bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes of the quoted tweet"),
                bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the quoted tweet", fields=[
                    bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
                    bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
                    bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
                    bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
                    bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
                    bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
                        bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
                        bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
                    ])
                ]),
                bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the quoted tweet"),
                bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes of the quoted tweet"),
                bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies of the quoted tweet"),
                bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets of the quoted tweet"),
                bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the quoted tweet"),
                bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label of the quoted tweet"),
                bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL of the quoted tweet"),
                bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the quoted tweet"),
                bigquery.SchemaField(name="mentionedUsers", field_type="RECORD", mode="REPEATED", description="Users mentioned in the quoted tweet", fields=[
                    bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the mentioned user"),
                    bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the mentioned user"),
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the mentioned user"),
                    bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the mentioned user"),
                    bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the mentioned user"),
                    bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the description of the mentioned user", fields=[
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                        bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                        bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                        bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                    ]),
                    bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user is verified"),
                    bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the mentioned user"),
                    bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the mentioned user"),
                    bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the mentioned user"),
                    bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the mentioned user"),
                    bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the mentioned user"),
                    bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the mentioned user is listed"),
                    bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the mentioned user"),
                    bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the mentioned user"),
                    bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user's tweets are protected"),
                    bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the mentioned user"),
                    bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the mentioned user"),
                    bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the mentioned user"),
                    bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the mentioned user"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the mentioned user")
                ]),
                bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet within the quoted tweet", fields=[
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet within the quoted tweet", fields=[
                        bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the retweeted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet within the quoted tweet"),
                    ]),
                    bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet within the quoted tweet"),
                ]),
                bigquery.SchemaField(name="quotedTweet", field_type="RECORD", mode="NULLABLE", description="Quoted tweet within the quoted tweet", fields=[
                    bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the quoted tweet within the quoted tweet", fields=[
                        bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the user's description", fields=[
                            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
                            bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
                            bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
                            bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
                        ]),
                        bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the user who posted the quoted tweet within the quoted tweet is verified"),
                        bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the user who posted the quoted tweet within the quoted tweet is listed"),
                        bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media items uploaded by the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the tweets of the user who posted the quoted tweet within the quoted tweet are protected"),
                        bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the user who posted the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the user who posted the quoted tweet within the quoted tweet")
                    ]),
                    bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="renderedContent", field_type="STRING", mode="NULLABLE", description="Rendered content of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="conversationId", field_type="INTEGER", mode="NULLABLE", description="Conversation ID of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="lang", field_type="STRING", mode="NULLABLE", description="Language of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="likeCount", field_type="INTEGER", mode="NULLABLE", description="Number of likes of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="media", field_type="RECORD", mode="REPEATED", description="Media attached to the quoted tweet within the quoted tweet", fields=[
                        bigquery.SchemaField(name="duration", field_type="FLOAT", mode="NULLABLE", description="Duration of the media"),
                        bigquery.SchemaField(name="fullUrl", field_type="STRING", mode="NULLABLE", description="Full URL of the media"),
                        bigquery.SchemaField(name="previewUrl", field_type="STRING", mode="NULLABLE", description="Preview URL of the media"),
                        bigquery.SchemaField(name="thumbnailUrl", field_type="STRING", mode="NULLABLE", description="Thumbnail URL of the media"),
                        bigquery.SchemaField(name="type", field_type="STRING", mode="NULLABLE", description="Type of media"),
                        bigquery.SchemaField(name="variants", field_type="RECORD", mode="REPEATED", description="Variants of the media", fields=[
                            bigquery.SchemaField(name="bitrate", field_type="INTEGER", mode="NULLABLE", description="Bitrate of the media variant"),
                            bigquery.SchemaField(name="contentType", field_type="STRING", mode="NULLABLE", description="Content type of the media variant"),
                            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the media variant")
                        ])
                    ]),
                    bigquery.SchemaField(name="outlinks", field_type="STRING", mode="REPEATED", description="Outlinks in the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="quoteCount", field_type="INTEGER", mode="NULLABLE", description="Number of quotes of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="replyCount", field_type="INTEGER", mode="NULLABLE", description="Number of replies of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="retweetCount", field_type="INTEGER", mode="NULLABLE", description="Number of retweets of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="source", field_type="STRING", mode="NULLABLE", description="Source of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="sourceLabel", field_type="STRING", mode="NULLABLE", description="Source label of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="sourceUrl", field_type="STRING", mode="NULLABLE", description="Source URL of the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="tcooutlinks", field_type="STRING", mode="REPEATED", description="t.co outlinks in the quoted tweet within the quoted tweet"),
                    bigquery.SchemaField(name="retweetedTweet", field_type="RECORD", mode="NULLABLE", description="Retweeted tweet within the quoted tweet within the quoted tweet", fields=[
                        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the retweeted tweet within the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the retweeted tweet within the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="user", field_type="RECORD", mode="NULLABLE", description="User who posted the retweeted tweet within the quoted tweet within the quoted tweet", fields=[
                            bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                            bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                            bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the user who posted the retweeted tweet within the quoted tweet within the quoted tweet"),
                        ]),
                        bigquery.SchemaField(name="date", field_type="TIMESTAMP", mode="NULLABLE", description="Date and time of the retweeted tweet within the quoted tweet within the quoted tweet"),
                        bigquery.SchemaField(name="content", field_type="STRING", mode="NULLABLE", description="Content of the retweeted tweet within the quoted tweet within the quoted tweet"),
                    ])
                ])
            ]),
        ])
    ]),
    bigquery.SchemaField(name="mentionedUsers", field_type="RECORD", mode="REPEATED", description="Users mentioned in the tweet", fields=[
        bigquery.SchemaField(name="username", field_type="STRING", mode="NULLABLE", description="Username of the mentioned user"),
        bigquery.SchemaField(name="displayname", field_type="STRING", mode="NULLABLE", description="Display name of the mentioned user"),
        bigquery.SchemaField(name="id", field_type="INTEGER", mode="NULLABLE", description="ID of the mentioned user"),
        bigquery.SchemaField(name="description", field_type="STRING", mode="NULLABLE", description="Description of the mentioned user"),
        bigquery.SchemaField(name="rawDescription", field_type="STRING", mode="NULLABLE", description="Raw description of the mentioned user"),
        bigquery.SchemaField(name="descriptionUrls", field_type="RECORD", mode="REPEATED", description="URLs in the description of the mentioned user", fields=[
            bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL in the description"),
            bigquery.SchemaField(name="text", field_type="STRING", mode="NULLABLE", description="Text in the description"),
            bigquery.SchemaField(name="indices", field_type="INTEGER", mode="REPEATED", description="Indices in the description"),
            bigquery.SchemaField(name="tcourl", field_type="STRING", mode="NULLABLE", description="t.co URL in the description")
        ]),
        bigquery.SchemaField(name="verified", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user is verified"),
        bigquery.SchemaField(name="created", field_type="TIMESTAMP", mode="NULLABLE", description="Account creation date of the mentioned user"),
        bigquery.SchemaField(name="followersCount", field_type="INTEGER", mode="NULLABLE", description="Number of followers of the mentioned user"),
        bigquery.SchemaField(name="friendsCount", field_type="INTEGER", mode="NULLABLE", description="Number of friends of the mentioned user"),
        bigquery.SchemaField(name="statusesCount", field_type="INTEGER", mode="NULLABLE", description="Number of statuses of the mentioned user"),
        bigquery.SchemaField(name="favouritesCount", field_type="INTEGER", mode="NULLABLE", description="Number of favourites of the mentioned user"),
        bigquery.SchemaField(name="listedCount", field_type="INTEGER", mode="NULLABLE", description="Number of times the mentioned user is listed"),
        bigquery.SchemaField(name="mediaCount", field_type="INTEGER", mode="NULLABLE", description="Number of media uploaded by the mentioned user"),
        bigquery.SchemaField(name="location", field_type="STRING", mode="NULLABLE", description="Location of the mentioned user"),
        bigquery.SchemaField(name="protected", field_type="BOOLEAN", mode="NULLABLE", description="Whether the mentioned user's tweets are protected"),
        bigquery.SchemaField(name="linkUrl", field_type="STRING", mode="NULLABLE", description="Link URL of the mentioned user"),
        bigquery.SchemaField(name="linkTcourl", field_type="STRING", mode="NULLABLE", description="Link t.co URL of the mentioned user"),
        bigquery.SchemaField(name="profileImageUrl", field_type="STRING", mode="NULLABLE", description="Profile image URL of the mentioned user"),
        bigquery.SchemaField(name="profileBannerUrl", field_type="STRING", mode="NULLABLE", description="Profile banner URL of the mentioned user"),
        bigquery.SchemaField(name="url", field_type="STRING", mode="NULLABLE", description="URL of the mentioned user")
    ])
]
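
The schema above repeats the same sub-records (for example `retweetedTweet` and its short `user` record) at several nesting levels. As a minimal sketch, not the approach used in this notebook, the repeated parts could be generated from one definition. The helpers `field`, `short_user_fields` and `retweeted_tweet` are hypothetical names introduced here for illustration; they build plain dictionaries in BigQuery's JSON schema layout (`name`, `type`, `mode`, `description`, `fields`).

```python
from typing import Any, Dict, List, Optional

Field = Dict[str, Any]


def field(name: str, field_type: str, description: str,
          mode: str = "NULLABLE", fields: Optional[List[Field]] = None) -> Field:
    """Build one column definition in BigQuery's JSON schema layout."""
    column: Field = {"name": name, "type": field_type, "mode": mode,
                     "description": description}
    if fields:
        column["fields"] = fields
    return column


def short_user_fields(owner: str) -> List[Field]:
    """Hypothetical helper: the three user columns kept for retweeted tweets."""
    return [
        field("username", "STRING", f"Username of {owner}"),
        field("id", "INTEGER", f"ID of {owner}"),
        field("displayname", "STRING", f"Display name of {owner}"),
    ]


def retweeted_tweet(context: str) -> Field:
    """Hypothetical helper: a retweetedTweet RECORD described relative to `context`."""
    what = f"the retweeted tweet within {context}"
    return field("retweetedTweet", "RECORD", f"Retweeted tweet within {context}", fields=[
        field("id", "INTEGER", f"ID of {what}"),
        field("url", "STRING", f"URL of {what}"),
        field("user", "RECORD", f"User who posted {what}",
              fields=short_user_fields(f"the user who posted {what}")),
        field("date", "TIMESTAMP", f"Date and time of {what}"),
        field("content", "STRING", f"Content of {what}"),
    ])


# The same definition is reused at two nesting levels.
level_1 = retweeted_tweet("the quoted tweet")
level_2 = retweeted_tweet("the quoted tweet within the quoted tweet")
print(level_2["fields"][2]["fields"][0]["description"])
```

Dictionaries in this layout can be turned into schema objects with `bigquery.SchemaField.from_api_repr`, assuming the `google-cloud-bigquery` client already used in this notebook.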

A function is defined to load the data from Google Cloud Storage (GCS) into BigQuery. The table is partitioned by the `date` field, which holds the tweet timestamp, so queries that filter on that field only need to scan the matching daily partitions.

In [6]:
def load_data_from_gcs_to_bigquery(uri, table_id):
    """Load a newline-delimited JSON file from GCS into a day-partitioned BigQuery table."""
    # Initialize the BigQuery client
    client = bigquery.Client()

    job_config = bigquery.LoadJobConfig(
        schema=schema,
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        max_bad_records=0,  # Fail the job on the first bad record
        time_partitioning=bigquery.TimePartitioning(
            type_=bigquery.TimePartitioningType.DAY,
            field="date"  # Partitioning column
        )
    )

    load_job = client.load_table_from_uri(
        uri, table_id, job_config=job_config
    )

    print(f'Starting job {load_job.job_id}')
    load_job.result()  # Block until the load job completes; raises if it failed
    print('Job finished.')

    destination_table = client.get_table(table_id)
    print(f'Loaded {destination_table.num_rows} rows.')

if __name__ == '__main__':
    uri = 'gs://bucket-project-latam-challenge/farmers-protest-tweets-2021-2-4.json'
    table_ref = f"{project_id}.{dataset_id}.{table_id}"

    load_data_from_gcs_to_bigquery(uri, table_ref)

Starting job d2a2f4a4-c5e4-4e88-9677-8e9824e8cab1
Job finished.
Loaded 117407 rows.
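
Because the table is partitioned by day on `date`, a query benefits most when its `WHERE` clause compares that column directly against constant bounds. The following is a minimal sketch of such a query for a single day; `tweets_for_day_query` is a hypothetical helper that is not part of the original solution, and the table name is the one referenced by the analysis query in this notebook.

```python
import datetime


def tweets_for_day_query(table: str, day: datetime.date) -> str:
    """Build a query whose WHERE clause bounds the partitioning column directly."""
    next_day = day + datetime.timedelta(days=1)
    # Comparing `date` itself against constant TIMESTAMP bounds lets BigQuery
    # skip the daily partitions that fall outside [day, next_day).
    return (
        "SELECT id, user.username AS username\n"
        f"FROM `{table}`\n"
        f"WHERE date >= TIMESTAMP('{day.isoformat()}')\n"
        f"  AND date < TIMESTAMP('{next_day.isoformat()}')"
    )


query = tweets_for_day_query(
    "project-latam-challenge.twitter_data.farmers_protest_tweets_2021",
    datetime.date(2021, 2, 12),
)
print(query)
```

The resulting string can be passed to `client.query(...)`. A query that needs every day, such as a ranking over all dates, still has to read all partitions.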


# Analysis

## Question 1
The top 10 dates with the most tweets, together with the user (username) who posted the most tweets on each of those days. The solution must include the functions `q1_time` and `q1_memory`.

This query runs on BigQuery, so helper functions are defined to create a client, execute a SQL statement and return its results.

In [7]:
from google.cloud import bigquery
from typing import List, Tuple
import datetime

def get_bigquery_client():
    return bigquery.Client()

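def run_bigquery_query(query: str):
    """Run a SQL query in BigQuery and return an iterator over the result rows."""
    client = get_bigquery_client()
    query_job = client.query(query)
    results = query_job.result()  # Wait for the query to finish
    return results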

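def q1_time(file_path: str) -> List[Tuple[datetime.date, str]]:
    # file_path is kept to match the required signature; the query reads from
    # the BigQuery table loaded earlier, not from the local file.
    query = """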
    WITH TEMP_DATA_001 AS 
    (
      SELECT DATE(date) AS fecha,
        id,
        user.username AS username
      FROM `project-latam-challenge.twitter_data.farmers_protest_tweets_2021` 
      WHERE DATE(date) IS NOT NULL
    ), TEMP_DATE_001 AS 
    (
      SELECT fecha,
        COUNT(DISTINCT id) AS tweet_qty
      FROM TEMP_DATA_001
      GROUP BY fecha
    ), TEMP_DATE_002 AS 
    (
      SELECT A.*,
        RANK() OVER(ORDER BY tweet_qty DESC, RAND()) AS ranking_tweet
      FROM TEMP_DATE_001 A
    ), TEMP_USER_001 AS 
    (
      SELECT fecha,
        username,
        COUNT(DISTINCT id) AS tweet_qty
      FROM TEMP_DATA_001
      GROUP BY fecha, username
    ), TEMP_USER_002 AS 
    (
      SELECT A.*,
        RANK() OVER(PARTITION BY fecha ORDER BY tweet_qty DESC, RAND()) AS ranking_user
      FROM TEMP_USER_001 A
    ), TEMP_USER_003 AS 
    (
      SELECT A.*
      FROM TEMP_USER_002 A
      WHERE ranking_user=1
    )
    SELECT A.fecha,
      B.username
    FROM TEMP_DATE_002 A
    LEFT JOIN TEMP_USER_003 B
    ON A.fecha=B.fecha 
    WHERE A.ranking_tweet<=10
    ORDER BY ranking_tweet
    """
    
    results = run_bigquery_query(query)
    return [(row.fecha, row.username) for row in results]

if __name__ == "__main__":
    file_path = '../data/farmers_protest_tweets_2021.json'
    top_10_dates = q1_time(file_path)
    for date, user in top_10_dates:
        print(f"Date: {date}, User: {user}")


Date: 2021-02-12, User: RanbirS00614606
Date: 2021-02-13, User: MaanDee08215437
Date: 2021-02-17, User: RaaJVinderkaur
Date: 2021-02-16, User: jot__b
Date: 2021-02-14, User: rebelpacifist
Date: 2021-02-18, User: neetuanjle_nitu
Date: 2021-02-15, User: jot__b
Date: 2021-02-20, User: MangalJ23056160
Date: 2021-02-23, User: Surrypuria
Date: 2021-02-19, User: Preetm91
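
To cross-check the SQL logic on a small sample without BigQuery, the same ranking can be sketched in plain Python. This is only an illustration under stated assumptions: `top_dates_with_top_user` is a hypothetical helper, the sample records and usernames are made up, and timestamps are assumed to be ISO 8601 strings in UTC. Unlike the query, which breaks ties with `RAND()`, this sketch breaks ties by first appearance, so its output is deterministic.

```python
import datetime
import json
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def top_dates_with_top_user(lines: Iterable[str], n: int = 10) -> List[Tuple[datetime.date, str]]:
    """Return the n busiest dates, each paired with its most active username."""
    tweets_per_date: Counter = Counter()
    tweets_per_user: Dict[datetime.date, Counter] = defaultdict(Counter)
    seen_ids = set()

    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the newline-delimited file
        tweet = json.loads(line)
        if tweet["id"] in seen_ids:  # roughly mirrors COUNT(DISTINCT id)
            continue
        seen_ids.add(tweet["id"])
        # Assumes ISO 8601 timestamps such as "2021-02-12T10:00:00+00:00".
        day = datetime.datetime.fromisoformat(tweet["date"]).date()
        tweets_per_date[day] += 1
        tweets_per_user[day][tweet["user"]["username"]] += 1

    return [
        (day, tweets_per_user[day].most_common(1)[0][0])
        for day, _ in tweets_per_date.most_common(n)
    ]


# Made-up sample records, not taken from the dataset.
sample = [
    '{"id": 1, "date": "2021-02-12T10:00:00+00:00", "user": {"username": "alice"}}',
    '{"id": 2, "date": "2021-02-12T11:00:00+00:00", "user": {"username": "alice"}}',
    '{"id": 3, "date": "2021-02-12T12:00:00+00:00", "user": {"username": "bob"}}',
    '{"id": 4, "date": "2021-02-13T09:00:00+00:00", "user": {"username": "bob"}}',
]
result = top_dates_with_top_user(sample)
print(result)
```

On the full file, the lines could be streamed with `open(file_path)` instead of the in-memory sample.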
