Customize output name: round 2 #20

jawrainey · 2020-05-04T22:30:32Z

Branched from @salmannotkhan's PR (#19) to not include the commits that change the file structure. I have added a new commit to improve and simplify the logic for accepting cell names to use for naming files. We should not commit this to master until testing is in place, but once resolved it will close #13.

Output files can now be given custom names with the -n parameter: -n [column_name]

- Added types to method definitions
- Renamed variables/method to be more obvious
- Removed seek/readline and opened the CSV again to generate filenames
- Added `column_name` to CLI as parameter such that it will assign it to the variable param of the same name.
- Used relative import in CLI.

salmannotkhan · 2020-05-04T22:52:34Z

About the src folder: I tried to use csv2docx, but because so many files and folders share the same name (the parent directory, the library folder, and the module itself), it caused confusion and I wasn't able to set csv2docx as the source folder. That's why I had to change the name.

Also, you said that `from csv2docx import convert` would be better than a relative import.

salmannotkhan · 2020-05-04T22:54:01Z

As for counter+1: I don't think the end user will understand why file numbering starts at 0 instead of 1, so I did that purely for the user's sake.

salmannotkhan

I guessed that this method takes more time than seek(), which is why I went with seek().

salmannotkhan · 2020-05-04T23:11:23Z

I'll try again tomorrow to come up with a better logic refactoring.

jawrainey · 2020-05-05T15:12:52Z

> I'll try again tomorrow to come up with a better logic refactoring.

I'd suggest that instead to read more on both type annotations and unit testing as that will be the main focus of this project in the coming weeks.

> I guessed that this method takes more time than seek(), which is why I went with seek().

@salmannotkhan -- you're right that the proposed changes will take more time than seek() (since they reopen a file of unknown size), but efficiency is unlikely to be an issue. Using a separate method also means that we can test unique_filenames() in isolation, whereas using seek() compounds the logic in convert(). That is not currently an issue as the convert method is small, but as we add more features it will grow in complexity -- making sure that does not happen early is important. This is of course open for debate 👍 Adding this new method has its own problems, though: a user of the module could now import and use the method, which is likely something we do not want.
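
To illustrate the testability argument, here is a minimal sketch of what a standalone filename helper might look like. It is not the implementation in this PR: the signature, the use of a file object instead of a path, and the `_2`, `_3` suffix scheme for repeated values are all assumptions made for the example.

```python
import csv
import io
from typing import Dict, List, TextIO


def unique_filenames(data: TextIO, column_name: str) -> List[str]:
    """Return one output name per CSV row, taken from `column_name`.

    Hypothetical sketch: repeated values get a numeric suffix.
    """
    seen: Dict[str, int] = {}
    names: List[str] = []
    for row in csv.DictReader(data):
        base = row[column_name].strip()
        count = seen.get(base, 0) + 1
        seen[base] = count
        # First occurrence keeps the plain value; later ones get _2, _3, ...
        # Caveat: a suffixed name can still clash with a literal cell
        # value such as "alice_2"; this sketch does not handle that.
        names.append(base if count == 1 else f"{base}_{count}")
    return names


# Because the helper takes any text stream, it can be exercised
# without touching the filesystem or calling convert().
sample = io.StringIO("name,age\nalice,30\nbob,25\nalice,41\n")
print(unique_filenames(sample, "name"))  # ['alice', 'bob', 'alice_2']
```

Prefixing the function name with an underscore (`_unique_filenames`) is one conventional way to signal that it is not part of the public module API.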

> As for counter+1: I don't think the end user will understand why file numbering starts at 0 instead of 1, so I did that purely for the user's sake.

Great point. I also think that outputting files based on a counter by default is probably not the best approach ... maybe we could instead force the user to choose a field they want to use as the output names by making column_name a required parameter? What do you think @davidverweij?

salmannotkhan · 2020-05-05T17:56:27Z

What if we check the directory for the file and, if it already exists, append a counter to the filename? That would avoid creating the filename list in advance.

salmannotkhan · 2020-05-05T18:25:03Z

Here is what I came up with:

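from glob import glob

def generate_name(filename):
    # Path the file would get if no earlier output used this name.
    fullpath = "./" + filename + ".docx"
    # Existing outputs sharing the prefix, e.g. name.docx, name_2.docx.
    filelist = glob("./" + filename + "*.docx")
    if fullpath in filelist:
        # Name is taken: append a counter based on the number of matches.
        return f"./{filename}_{len(filelist) + 1}.docx"
    return fullpath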

We can pass the return value as the name for the write function.

jawrainey · 2020-05-05T18:58:59Z

> What if we check the directory for the file and, if it already exists, append a counter to the filename? That would avoid creating the filename list in advance.

I am not sure what you mean, as the output file does not exist before running csv2docx, so there is no 'filename' to pass to generate_name. Creating the filename list in advance (see here) only occurs if column_name was provided, as there is no other way for us to know what the user wants to call the output files. An alternative would be to use sensible defaults, but since we accept any .csv, whose column names may vary widely, this is not possible.

My comment above was that instead of using a counter by default (e.g. 0-N.docx), we should make column_name required, so the user has to pass it and thereby helps us define a sensible filename (as already implemented in this PR). This way the user will always have output files that make sense to them.
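
As a rough sketch of the "required parameter" idea: the CLI library this project uses is not shown in this thread, so the standard-library `argparse` stands in here, and the long flag `--name` and the `parse` helper are hypothetical. Only `-n` and `column_name` come from the PR.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="csv2docx")
    parser.add_argument(
        "-n",
        "--name",  # hypothetical long form; only -n appears in the PR
        dest="column_name",
        required=True,  # no counter fallback: the user must pick a column
        help="CSV column whose values are used to name the output files",
    )
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # argv=None makes argparse read sys.argv[1:], as a real CLI would.
    return build_parser().parse_args(argv)


args = parse(["-n", "title"])
print(args.column_name)  # title
```

With `required=True`, omitting `-n` makes argparse print a usage error and exit, so the counter-based default never comes into play.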

salmannotkhan · 2020-05-05T20:18:48Z

> I am not sure what you mean as the output file does not exist before running csv2docx

What if the user runs the script again with another .csv file that contains the same names as a previous one? That is the case I'm talking about.

I took care of that issue as well; check my PR #22 to see if it's any good.
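
For the re-run case described above, one possible approach is to probe the output directory until a free name is found. This is only a sketch under assumptions, not what #22 does: `next_free_path` and its parameters are hypothetical names, and the `_2`, `_3` suffix scheme is chosen for illustration.

```python
import tempfile
from pathlib import Path


def next_free_path(directory: Path, stem: str, suffix: str = ".docx") -> Path:
    """Return a path in `directory` that does not exist yet.

    Tries stem.docx first, then stem_2.docx, stem_3.docx, and so on.
    """
    candidate = directory / f"{stem}{suffix}"
    counter = 2
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


# Simulate two runs that both want to write "report.docx".
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    first = next_free_path(out, "report")
    first.touch()  # stands in for the first run writing its file
    second = next_free_path(out, "report")
    print(first.name, second.name)  # report.docx report_2.docx
```

Unlike counting glob matches, this checks each candidate name directly, so unrelated files that merely share the prefix do not affect the result. The check and the later write are separate steps, so two processes running at the same time could still pick the same name.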

jawrainey · 2020-05-05T21:54:03Z

This PR has been supplanted by #22

salmannotkhan and others added 2 commits May 4, 2020 22:20

Customize output name

4e006a9

now can use custom naming using -n parameter -n [column_name]

jawrainey mentioned this pull request May 4, 2020

Customize output name #19

Closed

jawrainey self-assigned this May 4, 2020

salmannotkhan reviewed May 4, 2020

jawrainey mentioned this pull request May 5, 2020

Added unique naming for output files #22

Merged

jawrainey closed this May 5, 2020

jawrainey deleted the pr/19 branch May 5, 2020 21:54

davidverweij mentioned this pull request May 6, 2020

Destination Output and Unit Testing #18

Closed

davidverweij added a commit that referenced this pull request May 6, 2020

Merge branch 'salmannotkhan-master' resolving #16 #13 #18 #20

983f9d6
