Add data validation with Goodtables.io (#4) #5

Merged
merged 2 commits into from Mar 16, 2020

augusto-herrmann
Contributor

As described in issue #4, this pull request adds a schema for the table following the Table Schema standard, plus automatic validation by the Goodtables.io service.
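For readers unfamiliar with Table Schema, here is a rough sketch of what a descriptor for this table might look like, built in Python and printed as JSON. The column names and the resource name `coronabr` come from the validation output in this pull request; the field types, the relative `path`, and the helper `build_schema` are assumptions made for illustration and are not copied from the actual `dados/datapackage.json`.

```python
import json

# Column names as reported in the 'headers' list of the goodtables output.
COLUMNS = [
    "uid", "suspects", "refuses", "confirmado", "deads", "local",
    "cases", "comments", "broadcast", "date", "time", "uf",
]

# ASSUMPTION: these types are guesses for illustration only; the real
# schema in the repository may declare different types or constraints.
ASSUMED_TYPES = {
    "suspects": "integer",
    "refuses": "integer",
    "confirmado": "integer",
    "deads": "integer",
    "cases": "integer",
    "date": "date",
    "time": "time",
}


def build_schema(columns, types):
    """Build a Table Schema dict; columns without a guess default to string."""
    return {
        "fields": [
            {"name": name, "type": types.get(name, "string")}
            for name in columns
        ]
    }


# Hypothetical Data Package descriptor wrapping the schema.
descriptor = {
    "resources": [
        {
            "name": "coronabr",
            "path": "coronabr.csv",  # ASSUMPTION: path relative to dados/
            "schema": build_schema(COLUMNS, ASSUMED_TYPES),
        }
    ]
}

print(json.dumps(descriptor, indent=2))
```

Declaring `date` and `time` with their own types is what would let a validator flag malformed values in those columns, rather than treating every cell as free text.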

Some errors the tool already detects automatically in the current state of the data:

$ goodtables dados/datapackage.json

DATASET
=======
{'error-count': 15,
 'preset': 'nested',
 'table-count': 1,
 'time': 0.25,
 'valid': False}

TABLE [1]
=========
{'datapackage': 'dados/datapackage.json',
 'error-count': 15,
 'format': 'inline',
 'headers': ['uid',
             'suspects',
             'refuses',
             'confirmado',
             'deads',
             'local',
             'cases',
             'comments',
             'broadcast',
             'date',
             'time',
             'uf'],
 'resource-name': 'coronabr',
 'row-count': 604,
 'schema': 'table-schema',
 'source': '/home/herrmann/dev/coronabr/dados/coronabr.csv',
 'time': 0.106,
 'valid': False}
---------
[58,-] [duplicate-row] Row 58 is duplicated to row(s) 51
[59,-] [duplicate-row] Row 59 is duplicated to row(s) 52
[60,-] [duplicate-row] Row 60 is duplicated to row(s) 53
[61,-] [duplicate-row] Row 61 is duplicated to row(s) 54
[62,-] [duplicate-row] Row 62 is duplicated to row(s) 55
[63,-] [duplicate-row] Row 63 is duplicated to row(s) 56
[64,-] [duplicate-row] Row 64 is duplicated to row(s) 57
[211,-] [duplicate-row] Row 211 is duplicated to row(s) 203
[212,-] [duplicate-row] Row 212 is duplicated to row(s) 204
[213,-] [duplicate-row] Row 213 is duplicated to row(s) 205
[214,-] [duplicate-row] Row 214 is duplicated to row(s) 206
[215,-] [duplicate-row] Row 215 is duplicated to row(s) 207
[216,-] [duplicate-row] Row 216 is duplicated to row(s) 208
[217,-] [duplicate-row] Row 217 is duplicated to row(s) 209
[218,-] [duplicate-row] Row 218 is duplicated to row(s) 210

Using this tool makes it possible to detect errors like these (duplicate rows), as well as other possible errors (e.g., an invalid date), automatically on every commit to the repository.
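To make the duplicate-row check concrete, the sketch below shows one simple way such a check could work, using only the Python standard library. It is not how goodtables implements it; the function name `find_duplicate_rows` and the three-row sample are made up for illustration, and the row numbering assumes the header counts as row 1.

```python
import csv
import io


def find_duplicate_rows(csv_text):
    """Return (row_number, first_seen_row_number) pairs for repeated rows."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header, assumed to count as row 1
    first_seen = {}
    duplicates = []
    for row_number, row in enumerate(reader, start=2):
        key = tuple(row)
        if key in first_seen:
            # Same cell values as an earlier row: report against the first one.
            duplicates.append((row_number, first_seen[key]))
        else:
            first_seen[key] = row_number
    return duplicates


# Made-up sample data (not taken from coronabr.csv).
sample = (
    "uid,date,uf\n"
    "1,2020-03-15,SP\n"
    "2,2020-03-15,RJ\n"
    "1,2020-03-15,SP\n"
)

for row_number, original in find_duplicate_rows(sample):
    print(f"Row {row_number} duplicates row {original}")
```

Running a check like this on every commit, for example from a continuous integration job, is the kind of automation the pull request delegates to Goodtables.io.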

@belisards belisards merged commit 2cee071 into belisards:master Mar 16, 2020
2 participants