Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Might be nice to always call exporter.finish_exporting #5537

Closed
romanchyla opened this issue Jun 17, 2022 · 9 comments · Fixed by #5758
Closed

Might be nice to always call exporter.finish_exporting #5537

romanchyla opened this issue Jun 17, 2022 · 9 comments · Fixed by #5758

Comments

@romanchyla
Copy link

Description

When an exception happens inside Exporter (for whatever reason) - itemcount never gets increased (that's OK). But then during closing, the code doesn't get chance to act

This is the place in the code:

https://github.com/scrapy/scrapy/blob/master/scrapy/extensions/feedexport.py#L357

Perhaps the lines 357 and 353 should have been swapped?

Additional context

I'm using slot.storage to open a database; and an exporter to start a session -- store() gets called (to close connection to the database) but session didn't have chance to commit -- yes, there is nothing to commit, but still - given the semantics (also visible from the code -- when file is supposed to be a file descriptor it might be worth consideration to always call finish_exporting to give the file descriptor chance to react before it is closed by the parent object)

necessary caveat: I might be mis-using the API and it is a minor detail, so please forgive if this looks silly (and feature not a bug)

@wRAR
Copy link
Member

wRAR commented Jul 11, 2022

Without looking at it for enough time, it looks like we should call finish_exporting if we called start_exporting before (not always, though), if we want to support various protocols and formats. The example I'm usually using when thinking about these things is an XML file that needs a closing tag if there was an opening one.

@HatimZ
Copy link

HatimZ commented Oct 27, 2022

Can I solve this issue. Looks like just need to add the 'slot.finish_exporting()' before the if condition in the 'def _close_slot(self, slot, spider)'

@wRAR
Copy link
Member

wRAR commented Oct 27, 2022

@HatimZ probably, but you need to make sure this code isn't called if slot.start_exporting() wasn't called (by reading the code, or maybe by making some test case). You'll also need to add some new test to test this behavior.

@HatimZ
Copy link

HatimZ commented Oct 28, 2022

@wRAR .Okay. Are there any examples/resources for writing test cases?

@wRAR
Copy link
Member

wRAR commented Oct 28, 2022

If you mean a test case for making sure it's ohly called when start_exporting() was called it would just be a test spider with some exporter enabled and some different edge cases like zero items, maybe an exception happening at some early point etc.

@HatimZ
Copy link

HatimZ commented Oct 28, 2022

I checked the code to make sure that slot.finish_exporting() isn't called if slot.start_exporting() wasn't called and I have come to the conclusion that indeed its not happening anywhere in the file .
Even if slot.finish_exporting() was called without the start being called, it wouldn't do anything because it checks for the flag/boolean set by start. Hence even if it gets called , it wont do any harm. Am I right?
So Can I proceed to do what I suggested here ?

Can I solve this issue. Looks like just need to add the 'slot.finish_exporting()' before the if condition in the 'def _close_slot(self, slot, spider)'

@HatimZ
Copy link

HatimZ commented Oct 28, 2022

Also if test will be required can you point me to an already written test case, which I can use as an example. I am a beginner hence need some help. Thanks.

@HatimZ
Copy link

HatimZ commented Oct 30, 2022

@wRAR . If I can solve this and you can merge it until Monday, it will be counted towards hacktoberfest.

@wRAR
Copy link
Member

wRAR commented Oct 31, 2022

Even if slot.finish_exporting() was called without the start being called, it wouldn't do anything because it checks for the flag/boolean set by start. Hence even if it gets called , it wont do any harm. Am I right?

Yes, this sounds correct.

if test will be required can you point me to an already written test case, which I can use as an example.

I don't have one handy, but it looks like tests/test_exporters.py is a place with a lot of tests of the related code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants