Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port to Python3 [GCI] #31

Closed
wants to merge 15 commits into from
Closed

Conversation

srevinsaju
Copy link
Member

@srevinsaju srevinsaju commented Dec 21, 2019

No description provided.

@quozl
Copy link
Contributor

quozl commented Dec 23, 2019

sgmllib was in Python 2, deprecated in Python 2.6, and removed in Python 3. Rather than add a dependency, the code should be ported to a library bundled with Python 3. Only one other Sugar activity is affected by this; Infoslicer.

Your exec line has a typo.

@srevinsaju
Copy link
Member Author

@quozl , I had thought of porting sgmllib, but however feedparser.py is upstream. It can be obrained from https://github.com/kurtmckee/feedparser
Should we port it to html.parser or lxml ?
It is upstream resource, comments mentioned in https://github.com/kurtmckee/feedparser README.md , sgmllib3k is a dependency. Alternatively, we can include the src of sgmllib3k which is ported to Python3.
If sgmllib.py and feedparser.py support is to be dropped, feedparser is to be removed, and other methods are to implemented.
I will make changes accordingly

@quozl
Copy link
Contributor

quozl commented Dec 24, 2019

Thanks for the research. I hadn't twigged that feedparser.py is embedded source.

Debian has a Python 3 package of Feedparser. I'm fine with using that. Debian does not have sgmllib as a package.

However, Fedora does not have a package for Feedparser on Python 3. Some happy Fedora maintainer would have to make a package first.

If you'd like to update the embedded Feedparser to Python 3, go ahead. You might look at the Debian package to see how they deal with the sgmllib no longer in Python 3.

@srevinsaju
Copy link
Member Author

srevinsaju commented Dec 24, 2019

@quozl Fedora gives a feedparser for python3 too
https://rpmfind.net/linux/rpm2html/search.php?query=python3-feedparser

sudo dnf install python3-feedparser
[srevinsaju@ss-fedora physics]$ sudo dnf install python3-feedparser
Last metadata expiration check: 0:24:52 ago on Tue 24 Dec 2019 11:35:47 AM +03.
Dependencies resolved.
=================================================================================================================================================================================================
 Package                                              Architecture                             Version                                            Repository                                Size
=================================================================================================================================================================================================
Installing:
 python3-feedparser                                   noarch                                   5.2.1-10.fc31                                      fedora                                   104 k

Transaction Summary
=================================================================================================================================================================================================
Install  1 Package

Total download size: 104 k
Installed size: 430 k
Is this ok [y/N]: y
Downloading Packages:
python3-feedparser-5.2.1-10.fc31.noarch.rpm                                                                                                                       92 kB/s | 104 kB     00:01    
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                                                             54 kB/s | 104 kB     00:01     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                                                         1/1 
  Installing       : python3-feedparser-5.2.1-10.fc31.noarch                                                                                                                                 1/1 
  Running scriptlet: python3-feedparser-5.2.1-10.fc31.noarch                                                                                                                                 1/1 
  Verifying        : python3-feedparser-5.2.1-10.fc31.noarch                                                                                                                                 1/1 

Installed:
  python3-feedparser-5.2.1-10.fc31.noarch                                                                                                                                                        

Complete!
[srevinsaju@ss-fedora physics]$ python3
Python 3.7.5 (default, Oct 17 2019, 12:16:48) 
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import feedparser
>>> 

@quozl
Copy link
Contributor

quozl commented Dec 27, 2019

Okay, thanks. Can you use that and keep it embedded?

@srevinsaju
Copy link
Member Author

srevinsaju commented Dec 27, 2019

@quozl, like keep an internal copy also?
do you mean like , use the packaged-feed parser as well as the file. which on should I remove

@quozl
Copy link
Contributor

quozl commented Dec 27, 2019

Can you include feedparser in the activity, using the Python 3 version of feedparser you have found?

@srevinsaju
Copy link
Member Author

@quozl should I include the source code of feedparser in the activity?

@quozl
Copy link
Contributor

quozl commented Dec 27, 2019

Yes, that's what we have now, in the master branch, isn't it?

@srevinsaju
Copy link
Member Author

No, we are having a python2 version of it. Feedparser is no longer a single file, it has turned into a heavy package folder

@quozl
Copy link
Contributor

quozl commented Dec 30, 2019

Not yet ready to merge;

  • adds a new dependency,
  • upstream OPDS specification has adopted JSON.

@srevinsaju
Copy link
Member Author

srevinsaju commented Dec 30, 2019

Until its confirmed that OPDS has officially supported 2.0, and as python2 is getting deprecated in a few days,
451afdb will help to give python3 support for the activity.
This is a temporal fix, for anyone who may wish to use this in python3

EDIT:
As feedbooks have mentioned using atom in the api http://www.feedbooks.com/api, opds 1.2 is assumed to be officially supported.

@quozl
Copy link
Contributor

quozl commented Dec 30, 2019

Thanks. We have to be more forward looking than current support statements. I know feedbooks.com do support OPDS 1.2. So #32 says "determine if feedbooks.com and archive.org can support the new protocol".

We also have to be more forward looking than current deprecation statements. Although Python 2 deprecation has been announced and reinforced several times, we've often seen deprecations not take effect because there has been either (a) not enough people moving to the new version or API, or (b) enough people suddenly appear to provide ongoing support of the old version or API. So I'm not swayed by repeated mentions of deprecation. Python 2 is just a C program, after all. I'm fairly sure Red Hat will be supporting Python 2 for enterprise customers, as will Canonical. Consequently their release of source code may facilitate derivatives such as CentOS holding on to Python 2 for longer than the Python Foundation expects or cares about.

But thanks for adding sgmllib. That will get us moving on this pull request.

@quozl
Copy link
Contributor

quozl commented Jan 2, 2020

Thanks. Tested;

  • search for pride prejudice in feedbooks, selected first match, and downloaded an EPUB,
  • switch to internet archive; cursor did spin and nothing else happened; logs contained;
Traceback (most recent call last):
  File "/usr/share/sugar/activities/GetBooks.activity/opds.py", line 468, in __finished_cb
    next(reader)  # skip the first header row.
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
  • restarted and tried internet archive first, same result.

Hope that helps!

@srevinsaju
Copy link
Member Author

@quozl, I fixed the bytes error on 2f8ce18, but the internet archive is not loading however. It gives the error failed to download the list. check if it works before porting, probably the providers have changed their API, as get-books cannot access them. Thanks

@quozl
Copy link
Contributor

quozl commented Jan 2, 2020

I've checked as of 03e4b94 here and I agree no results are shown. I've checked master, and it does show results. I understand you don't have a Python 2 environment working? Can we help with that? Or, please add logging if you need me to test what happens. Or give instructions for me to use Pdb.

@srevinsaju

This comment has been minimized.

@srevinsaju
Copy link
Member Author

srevinsaju commented Jan 2, 2020

This is probably what happens on arch linux whch has completely changed all their build headers from python2 to Python3 .
see https://unix.stackexchange.com/questions/104288/missing-python-h-in-arch

And this missing Python2 headers are also a fact of wikipedia-activity which compiled on a Ubuntu WSL Subsystem today, but not on my Arch ❓

Running sugar-activity instead of sugar-activity3 gives

 ss@srevinsaju  ~/repo/get-books-activity   python3-ss  sugar-activity   
Traceback (most recent call last):
  File "/usr/local/bin/sugar-activity", line 3, in <module>
    from sugar3.activity import activityinstance
ImportError: No module named sugar3.activity

This is a faulty packages of 0.114 sugar-toolkit-gtk3 (based on python2 patched to work on arch) is causing conflicts with my AUR (sugar3-toolkit-gtk3) (python3)

@quozl
Copy link
Contributor

quozl commented Jan 2, 2020

@srevinsaju, the arch install failure is because you have already installed Sugar by hand or using your own packages. The install of sugar-toolkit failed but you don't need sugar-toolkit, as that is the GTK 2 Python 2 toolkit. GetBooks is currently GTK 3 Python 2. Install the sugar-toolkit-gtk3 repository source instead, or change your packaging of sugar-toolkit-gtk3 to include Python 2 in the build steps. https://github.com/sugarlabs/sugar/blob/master/docs/development-environment.md#native-sugar shows how to build by hand. Or, if you don't want to disrupt your working system, use a VM of Sugar Live Build, which includes Python 2 support.

@srevinsaju

This comment has been minimized.

@quozl
Copy link
Contributor

quozl commented Jan 2, 2020

"make install" looked successful. Where was the sugar3 module installed? Import it from there? See the native install instructions for how sometimes the system path doesn't include what is used by the default install.

@srevinsaju
Copy link
Member Author

Oh, I totally forgot about it. Let me check it out :) Thanks @quozl

@srevinsaju
Copy link
Member Author

srevinsaju commented Jan 2, 2020

@quozl I need assistance, python2.7 doesn't exist o_O.
Doing rechecks

$ ls /usr/local/lib/python2.7/site-packages
ls: cannot access '/usr/local/lib/python2.7/site-packages': No such file or directory
 ✘ $ ls /usr/local/lib/python2.7              
ls: cannot access '/usr/local/lib/python2.7': No such file or directory
 ✘ $ ls /usr/local/lib                        
aspell-0.60       libaspell.so         libpspell.so                 libsugar-eventcontroller.so        libsugarext.so        libsugarrunner.so
girepository-1.0  libaspell.so.15      libpspell.so.15              libsugar-eventcontroller.so.0      libsugarext.so.0      libsugarrunner.so.0
gtk-2.0           libaspell.so.15.3.1  libpspell.so.15.3.1          libsugar-eventcontroller.so.0.0.0  libsugarext.so.0.0.0  libsugarrunner.so.0.0.0
libaspell.la      libpspell.la         libsugar-eventcontroller.la  libsugarext.la                     libsugarrunner.la     python3.8
$ ls /usr/lib/python2.7/dist-packages | grep sugar
ls: cannot access '/usr/lib/python2.7/dist-packages': No such file or directory
 ✘ $ ls /usr/lib/python2.7/site-packages | grep sugar
 ✘ $ ~/repo/FractionBounce   flatpak-info  

@quozl
Copy link
Contributor

quozl commented Jan 2, 2020

I've no idea about flakpak, sorry. I don't know what that output means. You do need a version of Python 2 installed and working in order to install the Sugar Toolkit for GTK 3.

@Aniket21mathur
Copy link
Contributor

Your traceback shows you have cloned the activity inside another.

Ohh! Thanks for pointing out my mistake.

@srevinsaju
Copy link
Member Author

@quozl, @Aniketh21Mathur, the cursor is just spinning but the activity works, just try clicking 'By Title or any other component in the tree view after the internet archive option. I think the cursor spin was not reset. I will try to fix it today

@srevinsaju
Copy link
Member Author

srevinsaju commented Jan 23, 2020

@quozl @Aniket21mathur I have fixed the errors. they were actually not errors, but rather the wait cursor was not reset to the original cursor once we press the Internet archive option. I was able to recreate the error by @Aniket21mathur which gave a TypeError, which I fixed it on cec13b9 and the logging was updated on 253cb14. The bozo_exception should not pass simply if it ever happens in future

@quozl
Copy link
Contributor

quozl commented Jan 24, 2020

Thanks. I've tested 253cb14. I switched to Internet Archive, then typed pride and prejudice into the search, and pressed enter. The cursor began to spin. The message Performing lookup, please wait... did appear. I waited ten seconds. Nothing else happened. No logs.

I repeated the test with network packet tracing on my firewall system;

# tcpdump -s 0 -w tmp -i eth1 host $IP

Connection was opened by the app at 13:12:04.944038 and at 13:12:05.148803 a request sent to www.archive.org over HTTP, a GET of the URI

/advancedsearch.php?q=%28title%3A%28pride%20and%20prejudice%29%20OR
%20creator%3A%28pride%20and%20prejudice%29%29%20AND
%20format%3A%28DJVU%29&
fl%5B%5D=creator&fl%5B%5D=description&fl%5B%5D=format&fl%5B%5D=identifier&fl%5B%5D=language&fl%5B%5D=publisher&fl%5B%5D=title&fl%5B%5D=volume&
sort%5B%5D=title&sort%5B%5D&sort%5B%5D=&rows=500&save=yes&
fmt=csv&xmlsearch=Search

Response was sent by the server at 13:12:05.707975 of content-type text/csv.

Connection was closed by the server normally at 13:12:06.676381 along with the last data segment, total data size of 144180 bytes. The data seemed okay, but was not displayed.

Connection was closed by the client in response at 13:12:06.916956

All up no more than three seconds for query and data transfer.

Hope that helps!

Internet Archive Query Class used return in the for loop. However, this does not allow the calling of
update_cb which occurs after the for loop is over. The logic had to be altered to use break instead of
return, so that the last update callback is called"
@srevinsaju
Copy link
Member Author

Yes @quozl your debugging helped me solve that issue too. I found out the Url was valid, and so was the csv file. However, due to a bad logic in the forloop, the QueryResults returned None instead of the updated_cb. Later update signal was not connected in the other GetIABooksActivity.py, so that was also fixed. Thanks for the help

@quozl
Copy link
Contributor

quozl commented Jan 24, 2020

Thanks. Tetsed 0ea5a0c. Does display search results. Some of the search results can be downloaded. Some cannot. An error about the URL was displayed in an Alert. Logs contained "1579890286.778513 ERROR root: internet archive file list omits content type"

Any idea why some cannot be downloaded? Is it something the activity can adapt to?

@srevinsaju
Copy link
Member Author

@quozl I will check where the the LOG is emitted,

@srevinsaju
Copy link
Member Author

srevinsaju commented Jan 24, 2020

Yes, the log is created in line 408, opds.py, which has a fixme # FIXME: report to user a failure to find matching content
The error is created, but however not reported to the user

I guess the internet archive didn't have the URL you were searching for, but its just an assumption. Did you try copy pasting the URL which it attempts to download, and heck manually, or you can send me which book you were trying to download, so I can debug it. Thanks

@srevinsaju
Copy link
Member Author

@quozl Requesting review

@quozl
Copy link
Contributor

quozl commented Jan 27, 2020

Nothing has changed? 0ea5a0c review comments above; some of the search results cannot be downloaded. The activity needs to do better at handling this before I would release it.

@srevinsaju
Copy link
Member Author

@quozl if it important to notify the user, I would have to change the code at least 25% to include parent caller. This function was not working since before, but I shall workaround the issue then. Thanks

@quozl
Copy link
Contributor

quozl commented Jan 27, 2020

It certainly is important. Software should always advise the user when it cannot perform a requested operation. The advice might not need to be detailed.

Did you try copy pasting the URL which it attempts to download, and heck manually, or you can send me which book you were trying to download, so I can debug it. Thanks

As I said above, my test was to type "pride and prejudice" into the search box and then try downloading from the various result lines.

@srevinsaju
Copy link
Member Author

@quozl I have attempted a fix here, tried searching Pride and Prejudice now it shows an alert. Hope that would do.

@quozl
Copy link
Contributor

quozl commented Jan 28, 2020

Thanks. Tested f46f676.

  • select Internet Archive,
  • search for "pride and prejudice",
  • click on the first result "0918:- Pride And Prejudice" and wait for cover download,
  • click on Get Book and wait for download,
  • click on the second result "119 2014 04 09 Pride And Prejudice" and wait for cover download,
  • click on Get Book and wait.

It doesn't seem to start. There's a flicker of the list, that's about it. Repeatable failure; the second result doesn't download, and no explanation.

Some errors in log too; but not always happening;


Traceback (most recent call last):
  File "/usr/lib/python3.7/dist-packages/sugar3/network.py", line 291, in _read_next_chunk
    self.cleanup()
  File "/usr/lib/python3.7/dist-packages/sugar3/network.py", line 306, in cleanup
    os.close(self._outf)
TypeError: an integer is required (got type NoneType)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/gi/overrides/GLib.py", line 662, in <lambda>
    func_fdtransform = lambda _, cond, *data: callback(channel, cond, *data)
  File "/usr/lib/python3.7/dist-packages/sugar3/network.py", line 295, in _read_next_chunk
    self.cleanup(remove=True)
  File "/usr/lib/python3.7/dist-packages/sugar3/network.py", line 306, in cleanup
    os.close(self._outf)
TypeError: an integer is required (got type NoneType)
Traceback (most recent call last):
  File "/usr/lib/python3.7/dist-packages/sugar3/network.py", line 291, in _read_next_chunk
    self.cleanup()
  File "/usr/lib/python3.7/dist-packages/sugar3/network.py", line 306, in cleanup
    os.close(self._outf)
TypeError: an integer is required (got type NoneType)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/gi/overrides/GLib.py", line 662, in <lambda>
    func_fdtransform = lambda _, cond, *data: callback(channel, cond, *data)
  File "/usr/lib/python3.7/dist-packages/sugar3/network.py", line 295, in _read_next_chunk
    self.cleanup(remove=True)
  File "/usr/lib/python3.7/dist-packages/sugar3/network.py", line 306, in cleanup
    os.close(self._outf)
TypeError: an integer is required (got type NoneType)

@srevinsaju
Copy link
Member Author

@quozl I wasn't able to reproduce your error, what I see is this
image
I will try to workaround the log which you're getting, but idk why. I am not getting it. Tested on Sugar 0.116 Live Build and Sugar DE, no errors or logs

@srevinsaju
Copy link
Member Author

Ok, with 37b7225, I was able to reproduce the error, but I failed to find a fix to this issue. 😢

@quozl
Copy link
Contributor

quozl commented Jan 29, 2020

Thanks. I suggest adding more debug messages to help us find the problem. It may be in the Sugar Toolkit. When I try GetBooks v19 (Python 2) on Sugar 0.112 (Python 2), notable differences are;

  • more of the results can be downloaded,
  • the list becomes visibly insensitive during download.

You might also have a filtered internet connection. That's where debug messaging or tcpdump will help.

@srevinsaju
Copy link
Member Author

@quozl I may not be able to fix it with the current knowledge I have on G** modules, but I may give some information that I found from debugging, that may help other contributors in future:

Possible Fixes:

  • Traceback error from where the URL was not allowed to download or return the HTTP error code
  • Add a timeout for download internet archive

You may close this PR if you prefer, well, I guess I am stuck with this now.

@quozl
Copy link
Contributor

quozl commented Feb 3, 2020

Thanks. I'll keep it open, to inform the next person who tries to port. It may eventually be me doing it. The class GlibURLDownloader is part of the Sugar Toolkit, and so is our responsibility at Sugar Labs.

The activities that use it are GetBooks, Read, ReadETexts, ShowNTell, and ViewSlides. Sugar itself has jarabe.util.Downloader. None of these have been as extensively tested on Python 3 as GetBooks has been in this pull request. That's the basis of suspicion. I've not debugged it yet.

@srevinsaju
Copy link
Member Author

Ok thanks @quozl for the review.

@quozl quozl added this to Critical Issues in Port to Python 3 via Six Feb 25, 2020
@quozl quozl mentioned this pull request Mar 9, 2020
@quozl
Copy link
Contributor

quozl commented Mar 10, 2024

No longer current.

@quozl quozl closed this Mar 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants