Allow /src scripts to receive data files as command line arguments #167

Irio · 2016-12-14T18:25:47Z

The majority of scripts located in src/ process datasets in hard-coded locations, like data/cnpj-info.xz in src/fetch_cnpj_info.py#L10. Given that a lot has changed since their creation, we expect them to receive data paths as command line arguments, as in python src/fetch_cnpj_info.py data/2016-12-06-cnpj-info.xz.
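As a rough illustration of the request above, here is a minimal sketch of a script entry point that takes the dataset path as a positional argument and falls back to the old hard-coded location. The function names and the choice to keep data/cnpj-info.xz as a default are assumptions made for this example, not the project's actual code.

```python
import argparse

# Assumption: keep the previously hard-coded location as a fallback.
DEFAULT_DATASET = "data/cnpj-info.xz"


def parse_args(argv=None):
    """Parse the command line; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser(
        description="Fetch CNPJ info for the companies in a dataset."
    )
    parser.add_argument(
        "dataset",
        nargs="?",  # optional positional, so old invocations keep working
        default=DEFAULT_DATASET,
        help="path to the dataset file (default: %(default)s)",
    )
    return parser.parse_args(argv)


def main(argv=None):
    """Hypothetical entry point: only reports which file would be read."""
    args = parse_args(argv)
    print("Reading dataset from", args.dataset)
    return args.dataset
```

Called as main(["data/2016-12-06-cnpj-info.xz"]) this reads the dated file, and called with no arguments it falls back to the default path. A real script would invoke main() from an if __name__ == "__main__": guard.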


marcusrehm · 2016-12-15T00:27:39Z

Hi @Irio! I'm working on exactly that. I spoke with @cuducos about issue #67, as I also need to use fetch_cnpj_info with the amendments dataset.

Actually, I was thinking about passing the column with the CNPJs as a parameter too, so the call would be src/fetch_cnpj_info.py './data/2016-12-06-cnpj-info.xz' 'cnpj'. What do you think?

marcusrehm · 2016-12-15T03:27:28Z

Hi @Irio

I did the refactoring in fetch_cnpj_info.py. Now we can call it passing the filename and the column with the CNPJs.

cuducos · 2016-12-15T20:51:11Z

Sorry if I missed anything… but I don't think passing the column we need is what I had in mind. When I read OP's description, I think he meant the file to be loaded (for example 2016-08-08-companies.xz or 2016-12-06-companies.xz). Also, I'm afraid that's not what I meant in #67, @marcusrehm (but I'm gonna clarify that on the issue page).

marcusrehm · 2017-02-03T11:36:29Z

@Irio @cuducos I think this one was resolved with PR #185 as in https://github.com/datasciencebr/serenata-de-amor/blob/master/src/fetch_cnpj_info.py#L159 .

cuducos · 2017-02-03T14:04:47Z

@marcusrehm PR #185 addresses this issue when it comes to company scripts, but not for all scripts inside the src/ directory ; )

marcusrehm · 2017-02-05T11:24:08Z

yes, @cuducos , you're right! =)

martini97 · 2017-04-28T23:47:55Z

I would like to know specifically what the idea of this issue is. I've looked through some files in src/, and some of them just download data. Would you like those scripts to accept the path where the data is to be saved? Or would you like only the scripts that read files to receive arguments? Either way, I think it would be nice to post a roadmap with the files that you'd like to change. Thanks.

cuducos · 2017-04-29T01:23:30Z

marcusrehm · 2017-04-29T02:51:46Z

Hi @martini97! I think what @cuducos suggested was commented here. He's asking to create a sort of mapping like this:

# not functional, just an example
cols = {'amendments': 'beneficiary', 'other_dataset': 'something_else'}
cnpj_col = cols.get(base_file_name, 'cnpj')

So when the script receives a file as an argument, it can grab the data (CNPJs) from the referenced column and do the job.

It is already done in fetch_cnpj_info.py for the following datasets:

datasets_cols = {'reimbursements': 'cnpj_cpf',
                 'current-year': 'cnpj_cpf',
                 'last-year': 'cnpj_cpf',
                 'previous-years': 'cnpj_cpf',
                 'amendments': 'amendment_beneficiary'}

Is that right @cuducos ?
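To make the lookup described above concrete, here is a hedged sketch of how a script might derive the dataset name from the file path it receives and pick the CNPJ column from the mapping. The mapping is copied from the comment above; the helper name, the date-prefix pattern, and the 'cnpj' fallback (taken from the non-functional example) are assumptions, not the code in fetch_cnpj_info.py.

```python
import os
import re

# Mapping quoted in the thread: dataset name -> column holding the CNPJs.
DATASETS_COLS = {
    "reimbursements": "cnpj_cpf",
    "current-year": "cnpj_cpf",
    "last-year": "cnpj_cpf",
    "previous-years": "cnpj_cpf",
    "amendments": "amendment_beneficiary",
}


def cnpj_column_for(path, default="cnpj"):
    """Return the CNPJ column for a dataset file (hypothetical helper).

    Assumes file names look like YYYY-MM-DD-<dataset>.<ext>.
    """
    base_file_name = os.path.basename(path).split(".")[0]  # drop extension(s)
    # Strip a leading date such as "2016-12-06-" if present.
    dataset = re.sub(r"^\d{4}-\d{2}-\d{2}-", "", base_file_name)
    return DATASETS_COLS.get(dataset, default)
```

With this approach the caller passes only the file path, and datasets missing from the mapping fall back to the default column.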

cuducos · 2017-04-29T05:17:30Z

👍


willianpaixao · 2018-10-04T03:21:37Z

@Irio @cuducos is this still an issue?

cuducos · 2018-10-04T10:49:45Z

Closed because this src/ folder is not in use anymore.

marcusrehm mentioned this issue Dec 15, 2016

Looking for corruption on the Federal Budget #67

Closed

cuducos added the infrastructure label Jan 5, 2017

marcusrehm mentioned this issue Jan 27, 2017

Issue #67 - Looking for corruption on the Federal Budget #185

Merged

cuducos added the hacktoberfest label Mar 24, 2017

martini97 mentioned this issue Apr 29, 2017

Allow src/clean_cnpj_info_dataset.py scripts to receive data files as cli args #221

Closed

Irio pushed a commit that referenced this issue Feb 27, 2018

Merge pull request #167 from datasciencebr/cuducos-irregular-companie…

0ff7991

…s-friendly-name Add human friendly name for irregular companies classifier

cuducos closed this as completed Oct 4, 2018
