DRAFT: Pull request for issue 16 of shared resources #7

pedruck · 2025-04-29T04:27:27Z

The following changes were implemented:

The information pulled from the scripts database was compared with the information from the provided 2025.1 sheet. Based on this comparison, a new boolean parameter named "ACTIVE" is assigned in the "teacher" collection, indicating whether the teacher is active (true) or inactive (false). It is set to false when a teacher who was previously in the database is not present in the new 2025.1 sheet.
New parameters were added to the "offers" collection to indicate whether the offer was already in the database (assigning the tags ano: 2024 and semestre: 2) or came from the new 2025.1 sheet (assigning the tags ano: 2025 and semestre: 1).
The code was consolidated so that it can be run from a single file (main.py).
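
As an illustration of the teacher comparison described above, here is a minimal sketch, not the code from this PR. It assumes teachers are plain dictionaries matched on a hypothetical `_id` field, and the function name `mark_teacher_activity` and the `NOME` field are made up for the example.

```python
def mark_teacher_activity(old_teachers, sheet_teachers, key="_id"):
    """Merge both sources and attach a boolean ACTIVE flag to every teacher.

    A teacher present in the new sheet is active. A teacher found only in the
    database export is kept, but flagged as inactive.
    """
    merged = {}
    # Start by assuming every teacher already in the database is inactive.
    for teacher in old_teachers:
        merged[teacher[key]] = {**teacher, "ACTIVE": False}
    # Anyone listed in the sheet is active; sheet values override old ones.
    for teacher in sheet_teachers:
        previous = merged.get(teacher[key], {})
        merged[teacher[key]] = {**previous, **teacher, "ACTIVE": True}
    return list(merged.values())


# Hypothetical sample data: field names are assumptions, not the real schema.
old = [{"_id": 1, "NOME": "Ana"}, {"_id": 2, "NOME": "Bruno"}]
new = [{"_id": 2, "NOME": "Bruno"}, {"_id": 3, "NOME": "Carla"}]
result = mark_teacher_activity(old, new)
```

With this sample data, teacher 1 ends up inactive, while teachers 2 and 3 are active.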

NOTE: The database that is compared against the sheet and then updated is defined in the .env file (it can be "scripts" or "shared-resources").
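
One possible way to validate that setting is sketched below. The variable name `DATABASE_NAME` is an assumption (the PR does not state the actual key used in .env), and the sketch assumes the .env values have already been loaded into the process environment.

```python
import os

# The two database names mentioned in the note above.
ALLOWED_DATABASES = {"scripts", "shared-resources"}


def resolve_database_name(env=None, variable="DATABASE_NAME"):
    """Return the configured database name, rejecting unexpected values.

    `variable` is a hypothetical key; replace it with the one used in .env.
    """
    env = os.environ if env is None else env
    name = env.get(variable, "").strip()
    if name not in ALLOWED_DATABASES:
        raise ValueError(
            f"{variable} must be one of {sorted(ALLOWED_DATABASES)}, got {name!r}"
        )
    return name
```

Failing early on an unknown name avoids comparing against, and then overwriting, the wrong database.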

The only collections updated are:

offers and teachers (updated as described above)
disciplines (updated only according to the information obtained from the sheet)

-Execution logic:

JSON files obtained from the sheet are stored in the jsonfiles directory
JSON files obtained from the database are stored in the oldjsonfiles directory
JSON files produced by processing, updating, and comparing the data in the JSON files of the two directories above, and which are later sent to the chosen database (scripts or shared-resources), are stored in the comparedjsonfiles directory
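
The directory flow above can be sketched roughly as follows. This is an assumption-laden illustration: `compare_collection` and the `TEACHERS.json` file name are hypothetical, and the comparison itself is passed in as a function rather than taken from the PR.

```python
import json
import tempfile
from pathlib import Path


def compare_collection(base_dir, filename, compare):
    """Read one collection from jsonfiles and oldjsonfiles, write the result.

    `compare(old_docs, new_docs)` is a caller-supplied function returning the
    list of documents to store in comparedjsonfiles.
    """
    base_dir = Path(base_dir)
    new_docs = json.loads((base_dir / "jsonfiles" / filename).read_text(encoding="utf-8"))
    old_docs = json.loads((base_dir / "oldjsonfiles" / filename).read_text(encoding="utf-8"))
    result = compare(old_docs, new_docs)
    out_dir = base_dir / "comparedjsonfiles"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    out_path.write_text(json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8")
    return out_path


# Demonstration in a throwaway directory with hypothetical documents.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for folder, docs in (("jsonfiles", [{"_id": 2}]), ("oldjsonfiles", [{"_id": 1}])):
        (base / folder).mkdir()
        (base / folder / "TEACHERS.json").write_text(json.dumps(docs), encoding="utf-8")
    out = compare_collection(base, "TEACHERS.json", lambda old, new: old + new)
    written_folder = out.parent.name
    written_docs = json.loads(out.read_text(encoding="utf-8"))
```

Keeping the three directories separate means the sheet export and the database export stay untouched, so a comparison can be rerun without fetching the data again.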

Comparison between active and inactive teachers done. (Active teachers will have STATUS: ACTIVE, while inactive ones will have STATUS: DEACTIVE)

Code still incomplete in some parts

Preparing for the PR

Copilot

Pull Request Overview

This pull request implements data comparison and normalization changes to support shared resource updates, integrating information from both database scripts and an updated sheet. Key changes include:

Refactoring discipline normalization into a new class structure.
Orchestrating JSON generation, comparison, and upload via a consolidated main.py.
Adjusting file paths and parameter values (e.g., year/semester) to align with the new sheet for offers and teachers.

Reviewed Changes

Copilot reviewed 36 out of 49 changed files in this pull request and generated 3 comments.

File	Description
processamento_planilha/normalize.py	Introduces a NormalizeDisciplinas class to encapsulate discipline normalization with an updated file path.
processamento_planilha/main.py	Adds a main orchestrator that sequentially calls JSON generation, comparison, normalization, and upload.
processamento_planilha/load_sheet_to_mongodb.py	Updates file path and adjusts static year/semester values for sheet data.
processamento_planilha/get_old_collections_from_mongo.py	Extracts collections from MongoDB with minor formatting improvements.
processamento_planilha/extract_collections.py	Provides comprehensive data extraction and JSON generation from an Excel sheet.
processamento_planilha/comparando_jsons.py	Implements the ComparisonModule for merging JSON data with deactivation logic.
extract_collections.py	Removes duplicate extraction logic to consolidate functionality.

Files not reviewed (13)

.env.example: Language not supported
Dockerfile: Language not supported
new_collection/CAMPUS.json: Language not supported
new_collection/PERIODOS.json: Language not supported
new_collection/SALAS.json: Language not supported
processamento_planilha/comparedjsonfiles/CAMPUS.json: Language not supported
processamento_planilha/comparedjsonfiles/CURSOS.json: Language not supported
processamento_planilha/comparedjsonfiles/PERIODOS.json: Language not supported
processamento_planilha/jsonfiles/CAMPUS.json: Language not supported
processamento_planilha/jsonfiles/CURSOS.json: Language not supported
processamento_planilha/jsonfiles/PERIODOS.json: Language not supported
processamento_planilha/normalize.py~: Language not supported
processamento_planilha/oldjsonfiles/CAMPUS.json: Language not supported

Copilot · 2025-05-03T19:24:13Z

processamento_planilha/normalize.py

+class NormalizeDisciplinas:
+    @staticmethod
+    def normalize_disciplinas():


[nitpick] Consider adding a docstring for the NormalizeDisciplinas class and its normalize_disciplinas method to clearly describe its purpose and expected behavior.

Suggested change

-class NormalizeDisciplinas:
-    @staticmethod
-    def normalize_disciplinas():
+class NormalizeDisciplinas:
+    """
+    A class responsible for normalizing and updating the JSON file containing discipline data.
+    It processes the data by grouping documents with the same `_id` and consolidating their
+    `COD_CURS` values into a unique list.
+    """
+    @staticmethod
+    def normalize_disciplinas():
+        """
+        Normalizes the discipline data in the JSON file.
+        This method performs the following steps:
+        1. Loads the discipline data from the JSON file.
+        2. Groups documents by `_id` and consolidates their `COD_CURS` values.
+        3. Saves the updated data back to the JSON file.
+        """
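
The grouping step named in the docstring could look roughly like the sketch below. It is not the body of `normalize_disciplinas` from this PR: the function name `consolidate_disciplines` is hypothetical, and it assumes `COD_CURS` may arrive either as a single value or as a list.

```python
def consolidate_disciplines(documents):
    """Group documents by `_id` and merge their `COD_CURS` values.

    The first document seen for an `_id` supplies the other fields; course
    codes are collected in first-seen order without repetition.
    """
    grouped = {}
    for doc in documents:
        entry = grouped.setdefault(doc["_id"], {**doc, "COD_CURS": []})
        codes = doc.get("COD_CURS", [])
        if not isinstance(codes, list):
            codes = [codes]  # Assumed: a lone code may be stored as a scalar.
        for code in codes:
            if code not in entry["COD_CURS"]:
                entry["COD_CURS"].append(code)
    return list(grouped.values())


# Hypothetical sample documents.
docs = [
    {"_id": "MAT101", "COD_CURS": 10},
    {"_id": "MAT101", "COD_CURS": 20},
    {"_id": "FIS201", "COD_CURS": [10]},
    {"_id": "MAT101", "COD_CURS": 10},
]
normalized = consolidate_disciplines(docs)
```

Here the three `MAT101` documents collapse into one whose `COD_CURS` is `[10, 20]`.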

Copilot · 2025-05-03T19:24:13Z

processamento_planilha/load_sheet_to_mongodb.py


 # Load the Excel file
-file_path = 'sheet.xlsx'  # Replace with your actual file path
+file_path = '../sheet.xlsx'  # Replace with your actual file path


[nitpick] Using a relative path here may lead to issues if the script is executed from an unexpected directory; consider using a configurable base directory or an absolute path to ensure reliable file access.

Suggested change

-file_path = '../sheet.xlsx'  # Replace with your actual file path
+file_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '../sheet.xlsx')  # Construct absolute path
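
Note that the suggested line relies on `import os` being present in the module, which the diff does not show. An equivalent formulation with `pathlib` is sketched below as an alternative; the helper name `sheet_path` is hypothetical, and it assumes the sheet lives one directory above the script, as in the suggestion.

```python
from pathlib import Path


def sheet_path(script_file, filename="sheet.xlsx"):
    """Return the absolute path of a file stored one level above the script."""
    # resolve() makes the path absolute, so the result does not depend on
    # the directory the script is launched from.
    return Path(script_file).resolve().parent.parent / filename


# Inside the script this would be called as sheet_path(__file__).
example = sheet_path("/tmp/project/processamento_planilha/load_sheet_to_mongodb.py")
```

Unlike the `os.path.join` version, the resulting path contains no `..` segment.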

Copilot · 2025-05-03T19:24:14Z

processamento_planilha/comparando_jsons.py

+        for offer in newjson:
+            offer['ANO'] = 2025
+            offer['SEMESTRE'] = 1
+
+        resultjson = newjson + oldjson
+


The merging of new and old offer JSON data in GenerateComparedOffers could result in duplicate entries if an offer exists in both; consider implementing a de-duplication step before writing the result.

Suggested change

-        for offer in newjson:
-            offer['ANO'] = 2025
-            offer['SEMESTRE'] = 1
-        resultjson = newjson + oldjson
+        # Update new offers with the specified year and semester
+        for offer in newjson:
+            offer['ANO'] = 2025
+            offer['SEMESTRE'] = 1
+        # Use a dictionary to de-duplicate offers based on a unique key
+        offers_dict = {}
+        for offer in newjson + oldjson:
+            unique_key = f"{offer['ID']}"  # Replace 'ID' with the actual unique field(s)
+            offers_dict[unique_key] = offer
+        # Convert the dictionary values back to a list
+        resultjson = list(offers_dict.values())
+        # Write the de-duplicated result to the output file
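
One detail of the suggestion is worth noting: because it iterates over `newjson + oldjson` and later entries overwrite earlier ones, an old offer replaces a new one whenever both share the same key. If the same offer should be kept once per semester instead, a composite key is one option. The sketch below assumes that reading; `ID` is still the placeholder field from the suggestion, and `tag_offers` and `merge_offers` are hypothetical names.

```python
def tag_offers(offers, year, semester):
    """Return copies of the offers tagged with ANO and SEMESTRE."""
    return [{**offer, "ANO": year, "SEMESTRE": semester} for offer in offers]


def merge_offers(old_offers, new_offers, key_fields=("ID", "ANO", "SEMESTRE")):
    """Merge offers, keeping one document per composite key.

    New offers are processed last, so they win over old ones on a collision.
    """
    merged = {}
    for offer in old_offers + new_offers:
        key = tuple(offer[field] for field in key_fields)
        merged[key] = offer
    return list(merged.values())


# Hypothetical sample data: offer "A" exists in both semesters, and the
# sheet lists it twice.
old = tag_offers([{"ID": "A"}, {"ID": "B"}], 2024, 2)
new = tag_offers([{"ID": "A"}, {"ID": "A"}], 2025, 1)
merged_offers = merge_offers(old, new)
```

With this data the result holds three offers: "A" and "B" for 2024/2, and a single "A" for 2025/1.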

pedruck added 4 commits April 27, 2025 17:39

update and registration of the old teachers

ed677cd

Comparison between active and inactive teachers done. (Active teachers will have STATUS: ACTIVE, while inactive ones will have STATUS: DEACTIVE)

implementation of the offers tag update

29a5a57

Code still incomplete in some parts

bug fixes and completion of main

68df297

final adjustments

d2a87d5

Preparing for the PR

pedruck requested a review from danrleypereira April 29, 2025 04:27

GuiMcs00 requested review from Copilot and removed request for danrleypereira May 3, 2025 19:23

Copilot AI reviewed May 3, 2025

docker compose and readme complete

922e50c


2 participants

