Customise output names #16

Closed
davidverweij opened this issue May 4, 2020 · 10 comments
Labels
enhancement New feature or request

Comments

@davidverweij
Owner

davidverweij commented May 4, 2020

In my current use of the module, it would be particularly handy to have the output files be named using the data from the .csv. To illustrate, I am generating .docx files that need to be sent to the recipients, whose names I fill in the template. In order to trace back which .docx should go to whom, it would be convenient to allow parametric customisation of the output file names.

I am thinking of some kind of overloading (although I understand this is not intended in Python), or adding parameters and checking their validity after opening the .csv. Then, we would definitely need to allow multiple values to ensure some type of uniqueness (and check for this too). Perhaps a list parameter or similar?

E.g.

poetry run convert -t template.docx -c data.csv -n ["FIRSTNAME", "LASTNAME"]

With output files as:

john_doe.docx
john_doe_2.docx # duplicate
jane_doe.docx
...

Thoughts?
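
A minimal sketch of the naming scheme proposed above, assuming the values of the chosen columns are joined with underscores and lowercased (as the example output suggests). The helper name `build_stem` is hypothetical and not part of the module:

```python
def build_stem(row, columns):
    """Join the chosen column values into a lowercase, underscore-separated stem."""
    # Lowercasing and replacing spaces are assumptions based on the example output.
    parts = [row[column].strip().lower().replace(" ", "_") for column in columns]
    return "_".join(parts)


row = {"FIRSTNAME": "John", "LASTNAME": "Doe"}
stem = build_stem(row, ["FIRSTNAME", "LASTNAME"])
print(f"{stem}.docx")  # prints "john_doe.docx"
```

Handling duplicates such as john_doe_2.docx would be a separate step on top of this.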

@davidverweij davidverweij added the enhancement New feature or request label May 4, 2020
@jawrainey
Collaborator

jawrainey commented May 4, 2020

> In my current use of the module, it would be particularly handy to have the output files be named using the data from the .csv. To illustrate, I am generating .docx files that need to be sent to the recipients, whose names I fill in the template. In order to trace back which .docx should go to whom, it would be convenient to allow parametric customisation of the output file names.

Yes, being able to generate .docx named by a specific option would be great. Of course, that option must exist in the provided .csv so we will need to add validation. Therefore, the -n option must match row headers in the CSV. (maybe add that as the description?)
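
The header validation described above could look roughly like this. It is only a sketch: `missing_columns` is a hypothetical helper, and it assumes the requested names are compared against `DictReader.fieldnames`:

```python
import csv
import io


def missing_columns(fieldnames, requested):
    """Return the requested column names that are absent from the CSV header."""
    # fieldnames is None for an empty file, so fall back to an empty list.
    available = set(fieldnames or [])
    return [name for name in requested if name not in available]


# An in-memory CSV stands in for the real data file.
sample = io.StringIO("FIRSTNAME,LASTNAME\nJohn,Doe\n")
reader = csv.DictReader(sample)
print(missing_columns(reader.fieldnames, ["FIRSTNAME", "AGE"]))  # prints ['AGE']
```

Returning the missing names, rather than a boolean, makes it easy to tell the user exactly which -n values were not found.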

> I am thinking of some kind of overloading (although I understand this is not intended in Python), or adding parameters and checking their validity after opening the .csv. Then, we would definitely need to allow multiple values to ensure some type of uniqueness (and check for this too). Perhaps a list parameter or similar?

We would need to update the variable passed to write, which is currently counter. The multiple values must exist in the csv used, so we would need to validate from the fieldnames. The single_document dict can be used to get the value we're interested in using for the filename (e.g. name). Before we start to enumerate the csvdict we could have a list named filenames and append to it with each loop and then use it as a lookup to ensure uniqueness prior to assigning the filename?

Can you think of a more elegant solution than creating a temporary list for comparison?

An alternative could be to update the specified column (let's say it's name) prior to enumeration to better separate the logic, e.g. pass csvdict to some method which loops over the name column and updates the values in place (so if the name david appears twice, the values become david and david_2). Then we can replace counter with single_document[USER_OPTION], where USER_OPTION in this case is name.
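
As a rough sketch of that in-place alternative, assuming the rows have already been read into a list of dicts (the helper name `make_column_unique` is hypothetical):

```python
from collections import Counter


def make_column_unique(rows, column):
    """Suffix repeated values in `column` with _2, _3, ... (mutates rows in place)."""
    seen = Counter()  # value -> number of times it has appeared so far
    for row in rows:
        value = row[column]
        seen[value] += 1
        if seen[value] > 1:
            row[column] = f"{value}_{seen[value]}"
    return rows


rows = [{"name": "david"}, {"name": "jay"}, {"name": "david"}]
make_column_unique(rows, "name")
print([row["name"] for row in rows])  # prints ['david', 'jay', 'david_2']
```

A counter avoids the temporary list of filenames used for lookups. One caveat of this sketch: it does not guard against a CSV that already contains a literal value such as david_2.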

@salmannotkhan
Contributor

salmannotkhan commented May 4, 2020

I was working on this function and something strange is happening. I successfully created the verification:

> the -n option must match row headers in the CSV

But after implementing that feature I can't traverse csvdict, and I don't know why. Here is the function:

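def generate_names(listnm):
    # Return the names with repeats suffixed _2, _3, ... so each one is unique.
    seen = {}  # name -> number of times it has appeared so far
    newname = []
    for name in listnm:
        seen[name] = seen.get(name, 0) + 1
        # The first occurrence keeps its name; later ones get _<count>.
        newname.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return newname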

This is what I added to the convert function:

if custom_name is not None:
    if custom_name not in csv_headers:
        # Stop early: the requested column is not in the CSV header.
        raise SystemExit("column name not found")
    file_names = generate_names([row[custom_name] for row in csvdict])

After this block I can't traverse csvdict. The function returns a list of names, which we can then use at the end with: docx.write(f"{file_names[counter]}.docx")

@jawrainey
Collaborator

jawrainey commented May 4, 2020

@salmannotkhan -- feel free to make a draft pull request and I can have a look this evening (GMT+1) to try and understand the issue.

I suspect, although I'm not certain, that you cannot enumerate csvdict here because a DictReader can only be iterated over once (see here). The reader consumes the underlying file object line by line, much like a generator, so when you run list(row[custom_name] for row in csvdict) you read the file all the way to its end. After that the reader is exhausted and a second loop yields no rows; the file itself stays open until the with block exits.

To explore my hypothesis above, you could pass csvfile to your method and open it inside generate_names using a with statement. An alternative is to call seek(0) and reuse the same file, but this feels like a hack.
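
The behaviour under discussion can be reproduced in a few lines. This is an illustrative sketch that uses an in-memory file (`io.StringIO`) in place of the real .csv; the variable names mirror the thread but the data is made up:

```python
import csv
import io

# In-memory stand-in for the opened CSV file.
csvfile = io.StringIO("name\ndavid\njay\n")
csvdict = csv.DictReader(csvfile)

first_pass = [row["name"] for row in csvdict]
second_pass = [row["name"] for row in csvdict]  # the reader is already exhausted

# Option A: rewind the file and build a fresh reader so the header is parsed again.
csvfile.seek(0)
after_seek = [row["name"] for row in csv.DictReader(csvfile)]

# Option B: read all rows once into a list and loop over that list as often as needed.
csvfile.seek(0)
rows = list(csv.DictReader(csvfile))

print(first_pass)   # prints ['david', 'jay']
print(second_pass)  # prints []
print(after_seek)   # prints ['david', 'jay']
```

Option A creates a new DictReader after seek(0) rather than reusing the old one, because the old reader has already cached its field names and may treat the header line as a data row. Option B trades memory for simplicity, which is probably fine for small CSV files.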

@davidverweij
Owner Author

I had this issue before - and I concur - it iterates using a reader, which is why the code originally opened the .csv twice (a crude fix I admit).

@salmannotkhan
Contributor

Got it

@salmannotkhan
Contributor

I'll try to implement the generate_names function inside the output loop so we don't have to open the file twice.
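
One way this single-pass idea could be sketched is a generator that hands out a unique filename as each row is read, so the file is only iterated once. The name `iter_named_rows` is hypothetical, and the in-memory CSV is made-up sample data:

```python
import csv
import io
from collections import Counter


def iter_named_rows(reader, column):
    """Yield (filename, row) pairs, suffixing repeats so names stay unique."""
    seen = Counter()  # value -> number of times it has appeared so far
    for row in reader:
        value = row[column]
        seen[value] += 1
        stem = value if seen[value] == 1 else f"{value}_{seen[value]}"
        yield f"{stem}.docx", row


sample = io.StringIO("name\ndavid\ndavid\njay\n")
for filename, row in iter_named_rows(csv.DictReader(sample), "name"):
    print(filename)  # prints david.docx, david_2.docx, jay.docx in turn
```

Because the naming lives in its own function, it can still be tested separately from the convert loop.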

@jawrainey
Collaborator

jawrainey commented May 4, 2020

I'd suggest instead abstracting the logic into a separate method, as above, since that will make it easier to test. It also keeps the convert method clean and simple 👍

@salmannotkhan
Contributor

Done with this. I used seek because I didn't find any other way.

@jawrainey
Collaborator

Great work -- if you make a PR I can test it and do a code review for you 👍

@salmannotkhan
Contributor

yeah sure
