Adjusts the assignment of country and state data of institutions rela… by robertatakenaka · Pull Request #1321 · scieloorg/core

robertatakenaka · 2026-02-16T16:44:31Z

…ted to the journal

What does this PR do?

Describe the purpose of the pull request, for example: which problems it solves or which features it adds.

Where should the review start?

Indicate the path and the file where the reviewer should start reading the code.

How can this be tested manually?

List the steps the reviewer needs to follow to test the functionality manually.

Any context you would like to provide?

Describe a context in which the changes are necessary, or provide information that gives the reviewer context and makes the functionality easier to understand.

Screenshots

When applicable and possible, add screenshots that illustrate the visual aspect of the problem the pull request solves.

What are the relevant tickets?

Indicate an issue related to the pull request.

References

Indicate the references used to prepare the pull request.


Copilot

Pull request overview

This PR adjusts how the raw_country_* and raw_state_* fields are assigned when creating institution history items (publisher/owner/sponsor/copyright holder) associated with a journal, trying to infer whether a single received value represents a "name" or a "code/acronym".

Changes:

Adds conditional logic in Journal._add_institution_history to fill raw_country_name/raw_country_code when only one of the two is provided.
Adds similar logic for raw_state_name/raw_state_acron when only one of the two is provided.
Keeps direct assignment when both (name and code/acronym) are provided.
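
The branching described in the overview can be sketched in isolation. This is a minimal illustration rather than the project's code: `assign_country`, `new_history`, and the `SimpleNamespace` stand-in for the history record are hypothetical names, and only the country half is shown (the state half follows the same shape).

```python
from types import SimpleNamespace


def assign_country(history, raw_country_name=None, raw_country_code=None):
    """Mirror the branching described in the overview (stand-in, not the real method)."""
    if raw_country_name and raw_country_code:
        # Both supplied: store each value in its own field.
        history.raw_country_name = raw_country_name
        history.raw_country_code = raw_country_code
    elif raw_country_name or raw_country_code:
        # Only one supplied: guess which field it belongs to.
        raw_country = raw_country_name or raw_country_code
        if raw_country.upper() == raw_country and len(raw_country) == 2:
            history.raw_country_code = raw_country
        else:
            history.raw_country_name = raw_country
    return history


def new_history():
    # Hypothetical stand-in for the institution history record.
    return SimpleNamespace(raw_country_name=None, raw_country_code=None)


both = assign_country(new_history(), raw_country_name="Brasil", raw_country_code="BR")
code_only = assign_country(new_history(), raw_country_name="BR")
name_only = assign_country(new_history(), raw_country_code="Brasil")
```

Note that the single value is routed by its shape, not by which parameter carried it: `"BR"` passed as a name still ends up in the code field.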


Copilot · 2026-02-16T16:48:37Z

journal/models.py

+            raw_country = raw_country_name or raw_country_code
+            if raw_country.upper() == raw_country and len(raw_country) == 2:
+                institution_history.raw_country_code = raw_country
+            else:
+                institution_history.raw_country_name = raw_country
+
+        if raw_state_acron and raw_state_name:
+            institution_history.raw_state_name = raw_state_name
+            institution_history.raw_state_acron = raw_state_acron
+        elif raw_state_name or raw_state_acron:
+            raw_state = raw_state_name or raw_state_acron
+            if raw_state.upper() == raw_state and len(raw_state) == 2:
+                institution_history.raw_state_acron = raw_state
+            else:
+                institution_history.raw_state_name = raw_state


raw_country = raw_country_name or raw_country_code can be a non-string (e.g., extract_value(...) returns a list when there are multiple values). Calling raw_country.upper() will then raise AttributeError and break imports/tasks that call add_publisher/add_owner with only raw_country_name. Normalize raw_country to a string (e.g., pick first element or join) and strip() before applying case/length heuristics.

Suggested change

-            raw_country = raw_country_name or raw_country_code
-            if raw_country.upper() == raw_country and len(raw_country) == 2:
-                institution_history.raw_country_code = raw_country
-            else:
-                institution_history.raw_country_name = raw_country
-        if raw_state_acron and raw_state_name:
-            institution_history.raw_state_name = raw_state_name
-            institution_history.raw_state_acron = raw_state_acron
-        elif raw_state_name or raw_state_acron:
-            raw_state = raw_state_name or raw_state_acron
-            if raw_state.upper() == raw_state and len(raw_state) == 2:
-                institution_history.raw_state_acron = raw_state
-            else:
-                institution_history.raw_state_name = raw_state
+            raw_country_value = raw_country_name or raw_country_code
+            # Normalize raw_country to a string before applying heuristics
+            if isinstance(raw_country_value, (list, tuple)):
+                raw_country_normalized = ", ".join(str(item) for item in raw_country_value if item is not None)
+            else:
+                raw_country_normalized = str(raw_country_value)
+            raw_country_normalized = raw_country_normalized.strip()
+            if raw_country_normalized:
+                if (
+                    raw_country_normalized.upper() == raw_country_normalized
+                    and len(raw_country_normalized) == 2
+                ):
+                    institution_history.raw_country_code = raw_country_normalized
+                else:
+                    institution_history.raw_country_name = raw_country_normalized
+        if raw_state_acron and raw_state_name:
+            institution_history.raw_state_name = raw_state_name
+            institution_history.raw_state_acron = raw_state_acron
+        elif raw_state_name or raw_state_acron:
+            raw_state_value = raw_state_name or raw_state_acron
+            # Normalize raw_state to a string before applying heuristics
+            if isinstance(raw_state_value, (list, tuple)):
+                raw_state_normalized = ", ".join(str(item) for item in raw_state_value if item is not None)
+            else:
+                raw_state_normalized = str(raw_state_value)
+            raw_state_normalized = raw_state_normalized.strip()
+            if raw_state_normalized:
+                if (
+                    raw_state_normalized.upper() == raw_state_normalized
+                    and len(raw_state_normalized) == 2
+                ):
+                    institution_history.raw_state_acron = raw_state_normalized
+                else:
+                    institution_history.raw_state_name = raw_state_normalized
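
To make the failure mode in this comment concrete, the snippet below shows that a list has no `.upper()` method and sketches a small normalization step in the spirit of the suggestion. `normalize_raw` is a hypothetical helper name, and the choice to join multiple values with `", "` is only one of the options the comment mentions (the other being to pick the first element).

```python
def normalize_raw(value):
    """Return a stripped string for a scalar or a list/tuple of values."""
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        # Join multiple values, skipping None entries.
        return ", ".join(str(item) for item in value if item is not None).strip()
    return str(value).strip()


# A list has no .upper(), which is the error the comment describes.
try:
    ["BR"].upper()
    raised_attribute_error = False
except AttributeError:
    raised_attribute_error = True

single = normalize_raw(["BR"])
joined = normalize_raw(["Brasil", None, "Brazil"])
```

A joined value such as `"Brasil, Brazil"` will never look like a two-letter code, so under this approach multi-valued inputs always fall through to the name field.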

Copilot · 2026-02-16T16:48:37Z

journal/models.py

+            raw_country = raw_country_name or raw_country_code
+            if raw_country.upper() == raw_country and len(raw_country) == 2:
+                institution_history.raw_country_code = raw_country


The country-code heuristic only treats 2-letter uppercase strings as a code (len(...) == 2), but this project stores ISO alpha-3 codes too (raw_country_code has max_length=3 and code elsewhere uses acron3 like BRA/ARG). This will misfile values like BRA (or lowercase br/bra) into raw_country_name. Consider detecting 2–3 letter alphabetic codes case-insensitively and storing them uppercased.

Suggested change

-            raw_country = raw_country_name or raw_country_code
-            if raw_country.upper() == raw_country and len(raw_country) == 2:
-                institution_history.raw_country_code = raw_country
+            raw_country = (raw_country_name or raw_country_code).strip()
+            if raw_country and raw_country.isalpha() and 2 <= len(raw_country) <= 3:
+                institution_history.raw_country_code = raw_country.upper()
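
The case-insensitive 2–3 letter check proposed here can be tried out on its own. The sketch below assumes string input and uses a hypothetical function name, `split_country`, that returns a `(code, name)` pair instead of writing to the model.

```python
def split_country(value):
    """Return (code, name); at most one of the two is set."""
    text = str(value).strip()
    if not text:
        return None, None
    if text.isalpha() and 2 <= len(text) <= 3:
        # Short alphabetic values are treated as codes and stored uppercased.
        return text.upper(), None
    return None, text


alpha3 = split_country("BRA")
lower2 = split_country("br")
name = split_country("Brasil")
```

The trade-off is that any two- or three-letter alphabetic value is classified as a code, whether or not it is a valid ISO code; validating against a known list would be a separate step.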

Copilot · 2026-02-16T16:48:38Z

journal/models.py

+            if raw_country.upper() == raw_country and len(raw_country) == 2:
+                institution_history.raw_country_code = raw_country
+            else:
+                institution_history.raw_country_name = raw_country
+
+        if raw_state_acron and raw_state_name:
+            institution_history.raw_state_name = raw_state_name
+            institution_history.raw_state_acron = raw_state_acron
+        elif raw_state_name or raw_state_acron:
+            raw_state = raw_state_name or raw_state_acron
+            if raw_state.upper() == raw_state and len(raw_state) == 2:
+                institution_history.raw_state_acron = raw_state
+            else:
+                institution_history.raw_state_name = raw_state


The same heuristic issue exists for state: if only one value is provided and it's a lowercase acronym (e.g. sp) or a non-string (list from extract_value), the current raw_state.upper() == raw_state and len(raw_state) == 2 check will either misclassify it as a name or raise at runtime. Normalize to string/strip first and consider treating 2-letter alphabetic values case-insensitively as acronyms (storing uppercased).

Suggested change

-            if raw_country.upper() == raw_country and len(raw_country) == 2:
-                institution_history.raw_country_code = raw_country
-            else:
-                institution_history.raw_country_name = raw_country
-        if raw_state_acron and raw_state_name:
-            institution_history.raw_state_name = raw_state_name
-            institution_history.raw_state_acron = raw_state_acron
-        elif raw_state_name or raw_state_acron:
-            raw_state = raw_state_name or raw_state_acron
-            if raw_state.upper() == raw_state and len(raw_state) == 2:
-                institution_history.raw_state_acron = raw_state
-            else:
-                institution_history.raw_state_name = raw_state
+            raw_country_str = str(raw_country).strip()
+            if len(raw_country_str) == 2 and raw_country_str.isalpha():
+                institution_history.raw_country_code = raw_country_str.upper()
+            else:
+                institution_history.raw_country_name = raw_country_str
+        if raw_state_acron and raw_state_name:
+            institution_history.raw_state_name = raw_state_name
+            institution_history.raw_state_acron = raw_state_acron
+        elif raw_state_name or raw_state_acron:
+            raw_state = raw_state_name or raw_state_acron
+            raw_state_str = str(raw_state).strip()
+            if len(raw_state_str) == 2 and raw_state_str.isalpha():
+                institution_history.raw_state_acron = raw_state_str.upper()
+            else:
+                institution_history.raw_state_name = raw_state_str
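
One detail may be worth checking when a `str(...)` call is the only normalization step: applied to a list, `str` returns the list's printed representation, not its contents. The snippet below is a small illustration with a made-up input value; it avoids the runtime error, but the acronym check then sees the brackets and quotes.

```python
raw_state = ["SP"]  # illustrative: a single-element list instead of a plain string

# str() on a list yields its repr, brackets and quotes included.
as_text = str(raw_state).strip()
looks_like_acronym = len(as_text) == 2 and as_text.isalpha()

# Taking the first element before converting keeps the intended value.
first_item = str(raw_state[0]).strip()
first_looks_like_acronym = len(first_item) == 2 and first_item.isalpha()
```

Unwrapping list or tuple inputs before the `str(...)` call avoids storing the bracketed text in the name field.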

Copilot · 2026-02-16T16:48:38Z

journal/models.py

+            if raw_country.upper() == raw_country and len(raw_country) == 2:
+                institution_history.raw_country_code = raw_country
+            else:
+                institution_history.raw_country_name = raw_country
+
+        if raw_state_acron and raw_state_name:
+            institution_history.raw_state_name = raw_state_name
+            institution_history.raw_state_acron = raw_state_acron
+        elif raw_state_name or raw_state_acron:
+            raw_state = raw_state_name or raw_state_acron
+            if raw_state.upper() == raw_state and len(raw_state) == 2:
+                institution_history.raw_state_acron = raw_state
+            else:
+                institution_history.raw_state_name = raw_state
+


This change adds new branching behavior to interpret a single raw_country_* / raw_state_* input (code vs name), but tests only cover the case where both name+code are provided. Add regression tests for code-only inputs (2- and 3-letter, mixed/lowercase) and for list inputs returned by extract_value so imports/tasks don’t crash and values land in the intended fields.

Suggested change

-            if raw_country.upper() == raw_country and len(raw_country) == 2:
-                institution_history.raw_country_code = raw_country
-            else:
-                institution_history.raw_country_name = raw_country
-        if raw_state_acron and raw_state_name:
-            institution_history.raw_state_name = raw_state_name
-            institution_history.raw_state_acron = raw_state_acron
-        elif raw_state_name or raw_state_acron:
-            raw_state = raw_state_name or raw_state_acron
-            if raw_state.upper() == raw_state and len(raw_state) == 2:
-                institution_history.raw_state_acron = raw_state
-            else:
-                institution_history.raw_state_name = raw_state
+            # Handle list/tuple inputs (e.g., from extract_value) by taking the first element.
+            if isinstance(raw_country, (list, tuple)):
+                raw_country = raw_country[0] if raw_country else None
+            if raw_country is not None:
+                # Normalize to string for classification.
+                raw_country_str = str(raw_country).strip()
+                if raw_country_str and raw_country_str.isalpha() and len(raw_country_str) in (2, 3):
+                    # Treat 2- or 3-letter alphabetic values as country codes, normalizing to upper case.
+                    institution_history.raw_country_code = raw_country_str.upper()
+                else:
+                    institution_history.raw_country_name = raw_country_str
+        if raw_state_acron and raw_state_name:
+            institution_history.raw_state_name = raw_state_name
+            institution_history.raw_state_acron = raw_state_acron
+        elif raw_state_name or raw_state_acron:
+            raw_state = raw_state_name or raw_state_acron
+            # Handle list/tuple inputs (e.g., from extract_value) by taking the first element.
+            if isinstance(raw_state, (list, tuple)):
+                raw_state = raw_state[0] if raw_state else None
+            if raw_state is not None:
+                # Normalize to string for classification.
+                raw_state_str = str(raw_state).strip()
+                if raw_state_str and raw_state_str.isalpha() and len(raw_state_str) == 2:
+                    # Treat 2-letter alphabetic values as state acronyms, normalizing to upper case.
+                    institution_history.raw_state_acron = raw_state_str.upper()
+                else:
+                    institution_history.raw_state_name = raw_state_str
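
The regression tests requested in this comment could look roughly like the sketch below. It does not touch the Django model: `classify` is a hypothetical pure function that mirrors the suggested classification, and the test names and inputs are illustrative. Real tests would instead call the project's own methods and assert on the saved fields.

```python
import unittest


def classify(value, max_code_len):
    """Hypothetical helper: return ("code", VALUE), ("name", value) or None."""
    if isinstance(value, (list, tuple)):
        # Take the first element of list/tuple inputs, as in the suggestion.
        value = value[0] if value else None
    if value is None:
        return None
    text = str(value).strip()
    if not text:
        return None
    if text.isalpha() and 2 <= len(text) <= max_code_len:
        return ("code", text.upper())
    return ("name", text)


class ClassifyTests(unittest.TestCase):
    def test_country_codes(self):
        # 2- and 3-letter codes in upper, lower and mixed case.
        for raw in ("BR", "br", "BRA", "Bra"):
            with self.subTest(raw=raw):
                self.assertEqual(classify(raw, 3), ("code", raw.upper()))

    def test_country_name(self):
        self.assertEqual(classify("Brasil", 3), ("name", "Brasil"))

    def test_state_acronym_is_two_letters_only(self):
        self.assertEqual(classify("sp", 2), ("code", "SP"))
        self.assertEqual(classify("São Paulo", 2), ("name", "São Paulo"))

    def test_list_input_does_not_crash(self):
        self.assertEqual(classify(["BR", "Brasil"], 3), ("code", "BR"))
        self.assertIsNone(classify([], 3))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClassifyTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the classification in a pure function like this is a design choice that makes the edge cases testable without a database; whether the project wants such a helper is left open here.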

Adjusts the assignment of country and state data of institutions rela…

de3c637

…ted to the journal

Copilot AI review requested due to automatic review settings February 16, 2026 16:44

robertatakenaka merged commit 8e4e9cb into scieloorg:main Feb 16, 2026
7 of 9 checks passed

Copilot started reviewing on behalf of robertatakenaka February 16, 2026 16:44

Copilot AI reviewed Feb 16, 2026

robertatakenaka merged 1 commit into scieloorg:main from robertatakenaka:reorganiza_formulario_de_journal_para_editores_v2


