Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Customize output name: round 2 #20

Closed
wants to merge 2 commits into from
Closed

Customize output name: round 2 #20

wants to merge 2 commits into from

Conversation

jawrainey
Copy link
Collaborator

@jawrainey jawrainey commented May 4, 2020

Branched from @salmannotkhan's PR (#19) to not include the commits that change the file structure. I have added a new commit to improve and simplify the logic for accepting cell names to use for naming files. We should not commit this to master until testing is in place, but once resolved it will close #13.

salmannotkhan and others added 2 commits May 4, 2020 22:20
now can use custom naming using -n  parameter
-n [column_name]
- Added types to method definitions
- Renamed variables/method to be more obvious
- Removed seek/readline and opened the CSV again to generate filenames
- Added `column_name` to CLI as parameter such that it will assign it to the variable param of the same name.
- Used relative import in CLI.
@jawrainey jawrainey mentioned this pull request May 4, 2020
@jawrainey jawrainey self-assigned this May 4, 2020
@salmannotkhan
Copy link
Contributor

salmannotkhan commented May 4, 2020

About that src folder I tried to use csv2docx but because so many files/folders have same name(parent directory, library folder, module itself) it was creating some kind of confusion and I wasn't able to set csv2docx as source folder that's why I had to change name

And

You said that
from csv2docx import convert
Will be better than relative import

@salmannotkhan
Copy link
Contributor

For that counter+1 I don't think end user will understand why file numbering starts at 0 instead of 1 that's why just for the sake of user understanding I did that

Copy link
Contributor

@salmannotkhan salmannotkhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guessed this method take more time than seek() that's why I went with seek()

@salmannotkhan
Copy link
Contributor

I'll try again tommorow to come up with better logic refactoring

@jawrainey
Copy link
Collaborator Author

jawrainey commented May 5, 2020

I'll try again tommorow to come up with better logic refactoring

I'd suggest that instead to read more on both type annotations and unit testing as that will be the main focus of this project in the coming weeks.

I guessed this method take more time than seek() that's why I went with seek()

@salmannotkhan -- you're right that the proposed changes will take more time than seek() (since it reopens a file of unknown size), but efficiency is unlikely going to be an issue. Using a separate method also means that we can test unique_filenames() in isolation whereas using seek() compounds the logic in convert(). Not currently an issue as the convert method is small, but as we add more features it will row in complexity -- making sure that does not happen early is important. This is of course open for debate 👍 Of course, adding this new method has its own problems: now a user of the module could import and use the method -- likely something we do not want.

For that counter+1 I don't think end user will understand why file numbering starts at 0 instead of 1 that's why just for the sake of user understanding I did that

Great point. I also think that outputting files based on a counter by default is probably not the best approach ... maybe we could instead force the user to choose a field they want to use as the output names by making column_name a required parameter? What do you think @davidverweij?

@salmannotkhan
Copy link
Contributor

What if we check the directory for the file and if the file exists we append filename with counter and avoid creating the filename list in advance

@salmannotkhan
Copy link
Contributor

salmannotkhan commented May 5, 2020

Here is what i come up with:

from glob import glob

def generate_name(filename):
    fullpath = "./" + filename + ".docx"
    filelist = glob("./" + filename + "*.docx")
    if fullpath in filelist:
        return f"{filename}_{len(filelist) + 1}.docx"
    return fullpath

we can pass this return as name for write function

@jawrainey
Copy link
Collaborator Author

jawrainey commented May 5, 2020

What if we check the directory for the file and if the file exists we append filename with counter and avoid creating the filename list in advance

I am not sure what you mean as the output file does not exist before running csv2docx, so there is no 'filename' from which we can generate_name. Creating the filename list in advance (see here) only occurs if column_name was provided as there's no way we can know what the user wants to call the output files. An alternative would be to use sensible defaults, but given we accept any .csv that may have diverse ranges of column names, then this is not possible.

My comment above was that instead of using a counter by default (e.g. 0-N.docx), we should make column required by default so the user has to pass column_name and therefore helps us define a sensible filename (as already implemented in this PR). This way the user will always have output files that make sense to them.

@salmannotkhan
Copy link
Contributor

salmannotkhan commented May 5, 2020

I am not sure what you mean as the output file does not exist before running csv2docx

what if user is running the script again with another .csv file that contains same name from previous csv file. I'm talking about that case

i took care of that issue also check my PR #22 if that's any good

@jawrainey
Copy link
Collaborator Author

This PR has been supplanted by #22

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add destination folder
2 participants