"Message is not extractable" errors from offspec SFTP servers (possibly all IBM) #194

Closed
muraleee opened this issue Aug 2, 2013 · 9 comments

@muraleee

muraleee commented Aug 2, 2013

[Maintainer's note: a) this seems to be specific to IBM SFTP servers or others which place limits on how many times one can stat a file; b) there are two potential fixes for this, #562 and #579]

I have the following code, with the initialization of credentials removed.
Printing the directory listing works; however, "get" fails with the following exception:

paramiko [Errno 2] <....filename removed...> The message is not extractable!

I worked around it by making a copy of getfo and removing the prefetch call from it.


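# `t` is an authenticated paramiko.Transport and `localpath` an existing
# local directory; both are set up in the credential code removed above.
sftp = paramiko.SFTPClient.from_transport(t)

dirlist = sftp.listdir('.')
print("Dirlist: %s" % dirlist)

sftp.chdir('Inbox')
dirlist = sftp.listdir('.')
print("Dirlist: %s" % dirlist)

for fil in dirlist:
    # get() opens the local file itself, so no separate open call is needed
    sftp.get(fil, localpath + "/" + fil)
    print(fil)
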
@c1b3rh4ck

I can't reproduce that. Does anyone else have the same trouble?

@pswaminathan

I am having the same trouble, with a gzipped file. Judging from other reports of this problem (https://groups.google.com/forum/#!msg/pysftp/ReEdm1tbFgs/OigomtMYtXAJ, http://stackoverflow.com/questions/18028440/paramiko-sftp-get-error), it might be specific to gzipped files (as in the first report) or to binary files in general (I can't tell what type of file is involved in the second).

@bitprophet
Member

Kinda-sorta related to #187 (though I think that is asking for a value-add, whereas this here implies simply downloading a gzipped file is outright broken.)

@pswaminathan

After investigating this more, I don't think it's strictly a gzip-file issue. On the same server where I was getting that error on a gzip-file, I've also started getting it on text files. I'm getting the same issues when using PHP's standard SSH2 library as well. I have a feeling it's server-specific, and I'm working with that particular server admin to figure it out.

If/when we figure out what's going on, hopefully that'll lead us to a good check for it.

@bitprophet
Member

See #530 which feels like the same issue, and indicates it may be a problem with open file handles on the server side.

@torkil
Contributor

torkil commented Jul 21, 2015

I second that #530 feels like the same issue, and I think it's a case of paramiko not working well with IBM Sterling, most likely due to a bug/config issue. This may be related http://www-01.ibm.com/support/docview.wss?uid=swg1IT02065

Like others in this and related threads, I have been able to get around it by modifying the getfo call. The files I care about are only "extractable" once, so right now I am out of files to fetch, but I will return to this later in the week.

@bufke

bufke commented Oct 9, 2015

I had to comment out two lines to work around this. I assume this will break the callback feature.

import paramiko

class HorribleIBM_SFTPClient(paramiko.SFTPClient):
    def getfo(self, remotepath, fl, callback=None):
        with self.open(remotepath, 'rb') as fr:
            # The two lines commented out to work around the error:
            #file_size = self.stat(remotepath).st_size
            #fr.prefetch()
            file_size = 0  # total is unknown without stat(); the callback gets 0
            size = 0
            while True:
                data = fr.read(32768)
                fl.write(data)
                size += len(data)
                if callback is not None:
                    callback(size, file_size)
                if len(data) == 0:
                    break
        return size
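
The read loop in this workaround does not depend on anything server-specific, so it can be tried out without a connection. Below is a minimal sketch, assuming in-memory file objects stand in for the remote and local files; `copy_without_prefetch` is a hypothetical helper name, not part of paramiko, and passing 0 as the total is an assumption of this sketch for the case where the size was never requested.

```python
import io


def copy_without_prefetch(src, dst, callback=None, chunk_size=32768):
    """Copy src to dst in chunks without asking for the file size first.

    Hypothetical helper: the total size is unknown here, so the callback
    receives 0 as its second argument and must tolerate that.
    """
    copied = 0
    while True:
        data = src.read(chunk_size)
        if not data:
            break
        dst.write(data)
        copied += len(data)
        if callback is not None:
            callback(copied, 0)
    return copied


# Usage with in-memory stand-ins for the remote and local files.
progress = []
source = io.BytesIO(b"x" * 70000)
target = io.BytesIO()
total = copy_without_prefetch(source, target, lambda done, size: progress.append(done))
```

A progress callback that computes a percentage would need to guard against the zero total, which is one way the callback feature degrades when the stat is dropped.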

@bitprophet
Member

@bufke Did you get a chance to try the changes from #562 or #579? They seem to be less-disruptive modifications achieving a similar end.

Rolling other open tickets (hi #576) into this one. Leaving both of those mentioned PRs open for now, they appear to accomplish the same goal in similar but distinct ways, need to take a closer look and compare pros/cons.

@bitprophet changed the title from 'Getting error on SFTP' to '"Message is not extractable" errors from offspec SFTP servers (possibly all IBM)' on Nov 3, 2015
@bitprophet added this to the 1.16 milestone on Nov 3, 2015
@bitprophet
Member

Comparing the two PRs:

  • First, in reading them and the current version of the code, I note the following things:
    • In a regular SFTPClient.get(path) call, stat is called not once, not twice, but three times!
      • Once within get() (but the result isn't used; guessing it's cruft.)
      • Once within getfo() (handed to the callback if callback is given)
      • Once within SFTPFile.prefetch() (which is always called, within getfo()) (within prefetch, the resulting file size is used to do the actual prefetching)
    • There is confusion as to the true cause of the errors in question, as seen below.
      • So either this is still masking multiple similar but distinct issues, or one of the PRs is "wrong" (tho I assume both authors tested their PRs, so...¯\_(ツ)_/¯)
  • #579 (Update sftp_client.py and sftp_file.py so that STAT on a file is only…) notices the multi-stat-calling I mentioned above, and updates getfo and prefetch so a single initial stat result is passed around.
    • It's not explicitly stated in the PR but this also happens to make get/getfo consistent with put/putfo, which apparently already did this in exactly the same fashion.
    • Unfortunately, the change is backwards incompatible re: getfo's signature, and getfo is technically public (I know some users use it directly instead of get) so we'd need to swap that around (and then it'll be inconsistent with putfo...something to probably fix in 2.0 if possible).
    • Further, the change means that users of getfo are now responsible for running the initial passed-in stat themselves if they want callback functionality to work correctly (& if they don't it's probably a confusing bug to troubleshoot since it just silently ends up 0).
      • Since the initial stat in get() is unused, I see no reason not to just push it into getfo anyways, which would fix this.
      • Because putfo already does this, anybody directly using it also has the same potential problem. (And we can probably fix that the same way as in previous bullet point.)
    • Not a fan of a few other minor things in that PR like the pointless log line :( but those can be easily tweaked.
  • #562 (Call stat before opening file for reading) seems to think the issue is the calling of stat within the open contextmanager, and opts to move the stat call in getfo above the contextmanager block.
    • It then, like #579, reuses that result inside prefetch.
    • It does not nix the useless stat in get(), but if its diagnosis is accurate (the problem is calling stat after open, not the number of stat calls) it's irrelevant.
    • This approach doesn't have the side effect of semi-unifying getfo and putfo but as stated earlier that's not actually possible w/o breaking backwards compat, so I'm going to ignore it for now; it's not a big deal.
  • Given all the above, I'm going to opt to start with #562 (Call stat before opening file for reading) and then modify it a bit to nuke the vestigial first stat.
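
To make the call-order comparison concrete, here is a minimal sketch that records the order of requests in the current flow and in the flow chosen above. It is not paramiko code: `RecordingRemote`, `get_current` and `get_proposed` are hypothetical names, and the fake remote only logs calls. It takes no position on which request actually triggers the server error.

```python
class RecordingRemote:
    """Hypothetical stand-in for an SFTP session; logs each request in order."""

    def __init__(self, data):
        self.data = data
        self.events = []

    def stat(self):
        self.events.append("stat")
        return len(self.data)

    def open(self):
        self.events.append("open")
        return self

    def prefetch(self, size=None):
        # As described above: prefetch stats the file itself
        # unless a size is handed in.
        if size is None:
            size = self.stat()
        self.events.append("prefetch")
        return size


def get_current(remote):
    """Call order as described in the comparison: three stats."""
    remote.stat()          # in get(): result unused
    fr = remote.open()
    size = remote.stat()   # in getfo(): handed to the callback
    fr.prefetch()          # prefetch() stats once more
    return size


def get_proposed(remote):
    """Stat once before opening, then reuse the size in prefetch."""
    size = remote.stat()
    fr = remote.open()
    fr.prefetch(size)
    return size


current = RecordingRemote(b"abc")
get_current(current)
print(current.events)   # ['stat', 'open', 'stat', 'stat', 'prefetch']

proposed = RecordingRemote(b"abc")
get_proposed(proposed)
print(proposed.events)  # ['stat', 'open', 'prefetch']
```

In the second flow the only stat happens before the open, which would satisfy both diagnoses: fewer stat calls overall, and none issued while the file handle is open.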
