# Essays Batch Processing 🌊

Text analysis with the **Persona-Predict V2** 🧠 service. This notebook runs **batch** processing on **100** essays sampled from the *Brasil Escola* 🇧🇷 dataset.

  - [Dataset Brasil Escola](https://github.com/gpassero/uol-redacoes-xml/tree/master/brasilescola)
  - [Analysis Notebook](https://nbviewer.org/github/NeuroQuestAi/neuroquest-examples/blob/main/products/persona-predict/notebooks/Persona-Predict-Pop-PT-BR.ipynb?flush_cache=true)

If the graphics do not render 🚫, use this link:

  - [View in NBViewer](https://nbviewer.org/github/NeuroQuestAi/neuroquest-examples/blob/main/products/persona-predict/notebooks/Persona-Predict-Batch-PT-BR.ipynb?flush_cache=true)

In [1]:
import os
import glob 

import pandas as pd
import utility as U

from dotenv import load_dotenv

load_dotenv()

True

In [2]:
NQ_USER = os.getenv("NQ_USER")
NQ_PASSWORD = os.getenv("NQ_PASSWORD")

In [3]:
assert NQ_USER is not None, "set the user"
assert NQ_PASSWORD is not None, "set the password"

In [4]:
NQ_TOKEN = U.api_login(user=NQ_USER, password=NQ_PASSWORD).get("data").get("token")

In [5]:
assert NQ_TOKEN is not None, "login failed: no token returned"

In [6]:
# Start from a clean output directory so the final file count is meaningful.
for path in glob.glob("results/batch/*.json"):
    os.remove(path)
print("All JSON analyses removed from results/batch.")

All JSON analyses removed from results/batch.


## 1. Reading and Preparing Essays 📝💡

In [7]:
df = pd.read_csv("essays/br-escola.csv.gz", compression="gzip")
assert len(df) == 1000, "should be 1000!"

In [8]:
df = df.sample(n=100, random_state=42)
assert len(df) == 100, "should be 100!"

## 2. Essay Processing 📜🚀

- The data is saved in *JSON* format in the *results/batch/* directory.

In [9]:
df["essay"].apply(lambda essay: U.create_batch_analysis(essay, NQ_TOKEN))

521    {'code': 201, 'status': 'success', 'data': {'d...
737    {'code': 201, 'status': 'success', 'data': {'d...
740    {'code': 201, 'status': 'success', 'data': {'d...
660    {'code': 201, 'status': 'success', 'data': {'d...
411    {'code': 201, 'status': 'success', 'data': {'d...
                             ...                        
436    {'code': 201, 'status': 'success', 'data': {'d...
764    {'code': 201, 'status': 'success', 'data': {'d...
88     {'code': 201, 'status': 'success', 'data': {'d...
63     {'code': 201, 'status': 'success', 'data': {'d...
826    {'code': 201, 'status': 'success', 'data': {'d...
Name: essay, Length: 100, dtype: object
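
The implementation of `U.create_batch_analysis` is not shown in this notebook. As a rough illustration of the step described above (one JSON file per analysis in *results/batch/*), a helper along these lines could persist each response. The name `save_analysis` and the UUID-based file naming are assumptions made for this sketch, not the actual code in `utility`.

```python
import json
import uuid
from pathlib import Path


def save_analysis(response: dict, out_dir: str = "results/batch") -> Path:
    """Write one API response to its own JSON file and return the file path.

    Hypothetical helper: the real utility module may name and lay out
    its files differently.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    # A random UUID keeps file names unique across essays (assumed scheme).
    path = target / f"{uuid.uuid4().hex}.json"
    with path.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps Portuguese characters readable on disk.
        json.dump(response, fh, ensure_ascii=False, indent=2)
    return path
```

Writing one file per essay keeps a failed request from affecting results already on disk, and it lets the next section verify the run simply by counting files.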

## 3. Check the Generated Results 🖋️🔧

In [10]:
json_files = glob.glob("results/batch/*.json")
file_count = len(json_files)

print(f"Number of JSON analyses found: {file_count}")
assert file_count == 100, "should be 100!"

Number of JSON analyses found: 100
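
The count confirms that 100 files exist, but not that each one contains readable JSON. A minimal sketch of an extra check, assuming only that every file in *results/batch/* is meant to hold a single JSON document; `find_unreadable` is a hypothetical helper, not part of the `utility` module.

```python
import glob
import json


def find_unreadable(pattern: str = "results/batch/*.json") -> list[str]:
    """Return the paths matching `pattern` that do not contain valid JSON."""
    broken = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, encoding="utf-8") as fh:
                json.load(fh)
        except (json.JSONDecodeError, UnicodeDecodeError):
            # Truncated or corrupted output, e.g. from an interrupted run.
            broken.append(path)
    return broken
```

An empty list from `find_unreadable()` would indicate that every generated file parses cleanly; any returned paths point to essays worth reprocessing.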
