In 924 fix ftp file transfer method #114
Why these changes are being introduced:
* The ftplib.storbinary method throws errors that suggest file transfer has failed even when the Carbon run successfully uploads an XML file to the Elements FTP server.
  For the 'people' feed, it would log:
    ftplib.error_temp: 425 Error while transfering data: ECONNABORTED - Connection aborted
  For the 'articles' feed, it would log:
    TimeoutError: The read operation timed out
  These changes introduce cleaner error handling that avoids the error for the 'people' feed and catches timeout errors that do not actually seem to have a negative impact.

How this addresses that need:
* Remove custom storbinary method to resolve errors for 'people' feed
* Add try-except-finally code block to cleanly handle timeout errors for 'articles' feed

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/jira/software/c/projects/IN/boards/70?modal=detail&selectedIssue=IN-924
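The try-except-finally handling described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the function name `upload_xml`, its return values, and the choice to call `close()` in the `finally` clause are assumptions made for illustration. It also assumes Python 3.10+, where socket timeouts surface as the built-in `TimeoutError`, and a connection object that behaves like `ftplib.FTP_TLS` (only `storbinary()`, `quit()`, and `close()` are used).

```python
import logging

logger = logging.getLogger(__name__)


def upload_xml(ftps, file_obj, remote_name):
    """Upload file_obj with the default storbinary() and report the outcome.

    `ftps` is assumed to behave like ftplib.FTP_TLS. Returns "ok" on a clean
    run and "timeout" when a TimeoutError is raised (hypothetical values).
    """
    try:
        # Default (non-overridden) storbinary; blocks until the transfer ends.
        ftps.storbinary(f"STOR {remote_name}", file_obj)
        # Polite goodbye; may time out if the control connection sat idle.
        ftps.quit()
        status = "ok"
    except TimeoutError as exc:
        # Log as a warning so occurrences can be counted over time.
        logger.warning("Timeout during FTP upload of %s: %s", remote_name, exc)
        status = "timeout"
    except Exception:
        # Anything else is treated as a real failure and re-raised.
        logger.exception("Unexpected error during FTP upload of %s", remote_name)
        raise
    finally:
        # Always release the socket, whatever happened above.
        ftps.close()
    return status
```

Taking the connection as a parameter keeps the sketch testable without a live FTP server; a real caller would pass in an already connected and logged-in client.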
4927e98 to
05b509f
Compare
|
@ghukill @ehanson8 I made a small update to the |
Looks good to me.
Thanks for the detailed PR and confluence document.
Can confirm from pairing that debugging this was difficult and involved. Because it appears the XML file is getting written successfully, the try / except TimeOut / except <generic> / finally block makes sense to me. And, with the logging in place, over time we could see how many times that timeout warning is thrown.
My two cents is that introducing workarounds to avoid the timeout on ftps.quit() may not be worth the complexity it introduces, given the "piping" nature of this application.
|
This is a very thorough explanation of the issue, which is useful for an outsider and a good template for future PRs detailing a complicated situation.
**Note**: As of this writing, Apple M1 Macs cannot run Oracle Instant Client.
* If you are on a machine that can run Oracle Instant Client, follow the steps outlined in [Without Docker](#without-docker).

The data retrieved by the Carbon application contains personally identifiable information (PII), so downloading the files is not recommended. However, if examining the files created by Carbon is **absolutely necessary** for testing purposes, this can be done on your local machine via a Docker container. For more information, please refer to the Confluence document: [How to download files from an application that connects to the Data Warehouse](https://mitlibraries.atlassian.net/wiki/spaces/~712020c6fcb37f39c94e54a6bfd09607f29eb4/pages/3469443097/Running+applications+in+a+local+Docker+Container).
Great warning that I should reuse in alma-patronload
|
Thank you both for your assistance and guidance through this exploration!
What does this PR do?
The work in this PR is to investigate the 'errors' with the `ftplib.storbinary` method that are logged for the 'people' feed (described in this GitHub issue) and the 'articles' feed (described in this Jira ticket).
Here are some key learnings from this effort:
Articles feed
* While `storbinary()` is running (it takes >5 minutes for the articles feed), it is 'blocking' any other commands from being run on the FTP server. This leads to an 'idle' FTP connection, as described in this thread.
* `CarbonFtpsTls.timeout` does not seem to have an effect. It just delays the logging of the `TimeoutError`.
* The `TimeoutError` has not been shown to be an issue, and the run will still result in an XML file uploaded to the Elements FTP server.
People feed
* Use the default `FTPS_TLS.storbinary()` method.
Helpful background context
While looking into this issue, @ghukill came across this thread, which we assume is the reason for previously overriding the `storbinary()` command. However, as stated previously, using the default `FTPS_TLS.storbinary()` method results in the same XML file while avoiding the error for the 'people' feed.
How can a reviewer manually see the effects of these changes?
Compare logs from different code iterations, and verify that latest code enables successful Carbon runs with improved logging
Produced by code in `main`:
People feed

Articles feed

Produced by Commenting out CarbonFtpsTls.storbinary():
People feed

Articles feed

Code in this branch
People feed

Articles feed

Compare XML files produced with latest code (excludes custom `CarbonFtpsTls.storbinary`) against XML files produced with current code in `main`
* `diff` shows that the pairs of files (original vs. latest) have the same bytes/characters.
Includes new or updated dependencies?
NO