Docs: From forms.FileField() to ExcelInMemoryUploadedFile ? #41

Closed
guettli opened this issue Oct 26, 2017 · 7 comments

Comments

@guettli

guettli commented Oct 26, 2017

I do not understand this:

UploadFileForm is html widget for file upload form in the html page. Then look down at filehandle. It is an instance of either ExcelInMemoryUploadedFile

Source: http://django-excel.readthedocs.io/en/latest/

You use an ordinary forms.FileField(). I guess the matching config in settings.py is needed.

Please make this part more newbie-friendly.

What could be added here?

UploadFileForm is html widget for file upload form in the html page. *** here ** Then look down at filehandle. It is an instance of either ExcelInMemoryUploadedFile

@chfw
Member

chfw commented Oct 26, 2017

OK. I will elaborate more on that in the doc. Thanks for your comments.

In this section, forms.FileField() tells the UploadFileForm that it needs a form field that accepts a file upload.

UploadFileForm can then be rendered by Django as an HTML form on your web page, and the form instance can also be used for form validation on the server side.

@guettli
Author

guettli commented Oct 27, 2017

Don't get me wrong. I just don't know what big benefit django-excel provides.

I use pyexcel like this now:

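from django import forms
import pyexcel

class FooForm(forms.Form):
    spreadsheet = forms.FileField()

    def save(self):
        temporary_uploaded_file = self.cleaned_data['spreadsheet']
        # Take the file type from the uploaded file's extension, e.g. "xlsx".
        file_type = temporary_uploaded_file.name.split('.')[-1]
        # iget_records yields one dict per row, keyed by column header.
        for row_dict in pyexcel.iget_records(file_stream=temporary_uploaded_file,
                                             file_type=file_type):
            do_foo(row_dict['my_col'])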

Why should I use django-excel? Why is it better than the way I do it?

I like things to be reusable. I don't like having to change config such as settings.FILE_UPLOAD_HANDLERS for one app. I run several apps in my project. What should I do if app Foo wants Foo-Upload-Handler and app Bar wants Bar-Upload-Handler?

@chfw
Member

chfw commented Oct 27, 2017

I like your critical thinking. Here is my rationale for django-excel.

django-excel provides two handlers in order to save the time that would otherwise be spent saving the uploaded file to the file system and reading it back for further processing. In particular, MemoryFileUploadHandler holds the uploaded file in memory. The idea is that ExcelMemoryFileUploadHandler can read Excel data directly from that memory buffer without saving it to the file system first.

Unless you have other means of obtaining the uploaded file more efficiently, you can ignore django-excel.
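
For readers following along: the handlers discussed above are switched on through Django's FILE_UPLOAD_HANDLERS setting. Below is a minimal sketch of the settings.py fragment. It assumes the dotted paths given in the django-excel documentation. The second name, TemporaryExcelFileUploadHandler, does not appear in this thread, so verify both paths against your installed version.

```python
# settings.py (fragment)
# Assumption: both dotted paths are taken from the django-excel docs;
# confirm them against the version you have installed.
FILE_UPLOAD_HANDLERS = (
    # Small uploads are kept in memory and exposed as Excel-aware files.
    "django_excel.ExcelMemoryFileUploadHandler",
    # Larger uploads are spooled to a temporary file instead.
    "django_excel.TemporaryExcelFileUploadHandler",
)
```

This setting is project-wide, which is the concern raised earlier in the thread about running several apps in one project.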

@guettli
Author

guettli commented Oct 28, 2017

Thank you for your feedback. And thank you for liking my critical thinking.

Above I provided the lines which I use now.

Do you think there is a more efficient way to use pyexcel and django (without django-excel)?

@chfw
Member

chfw commented Oct 28, 2017

Apart from handling file uploads, you can use pyexcel the way you want (I mean without django-excel). The main challenge for pyexcel is reading and writing data quickly. Its performance differs between data under 1 MB, hundreds of MB, and multiple GB. Here is what I found about configuring pyexcel plugins to cope with big data sets: http://pyexcel.readthedocs.io/en/v0.6.0/iodrivers.html#read-and-write-with-performance
and here:
http://pyexcel.readthedocs.io/en/v0.6.0/two-liners.html

@chfw
Member

chfw commented Nov 3, 2017

And when django-excel saves the uploaded data into Django models, it uses bulk_insert (which is much quicker than saving rows one by one). However, to be able to use bulk_insert, the developer needs to ensure the uploaded data will not cause database exceptions on insertion; a typical one is a database integrity error.
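
To illustrate why one bad row matters for a batched insert, here is a rough sketch using only the standard-library sqlite3 module as a stand-in. It is not django-excel's or Django's code (in Django's ORM the batched call is QuerySet.bulk_create). The table, the rows, and the helper name insert_batch are hypothetical. The point is that a single integrity error rolls back the whole batch, including the rows that were valid.

```python
import sqlite3


def insert_batch(conn, rows):
    """Insert all rows in one transaction; return how many rows were stored.

    Hypothetical helper for illustration only.
    """
    try:
        # The connection context manager commits on success and
        # rolls back the whole transaction if an exception escapes.
        with conn:
            conn.executemany(
                "INSERT INTO person (id, name) VALUES (?, ?)", rows
            )
    except sqlite3.IntegrityError:
        return 0  # nothing from this batch was kept
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

good_rows = [(1, "Ada"), (2, "Grace")]
# The second row reuses primary key 1, so the batch violates integrity.
bad_rows = [(3, "Linus"), (1, "Duplicate id")]

inserted_good = insert_batch(conn, good_rows)
inserted_bad = insert_batch(conn, bad_rows)
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

After both calls only the two rows from the first batch remain; the valid row with id 3 is lost along with the duplicate, which is why the uploaded data needs checking before a batched insert.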

@chfw
Member

chfw commented Mar 15, 2018

9b79176

@chfw chfw closed this as completed Mar 15, 2018