pyoai health / maintenance status? #43

interrogator · 2020-04-24T17:08:27Z

This project seems to have fallen into a state of semi-disrepair, no recent commits in master, PRs unmerged, and so on. That said, there are few Python OAI-PMH resources, so it looks like this one does still see some use, and could be important for the community.

Looking at forks of the project, I note that none is expanding dramatically on pyoai, more just little fixes and customisations. So by looking at the repo, there is also the chance that the project is simply stable and does what it needs to. Or, erm, it's essentially abandoned without clarification from maintainers.

Do the authors want to make a comment regarding current health of the project? If the project is no longer properly maintained, is there a successor project, advice, etc?

From what I can tell, one viable alternative is the sickle project, OAI-PMH for humans, also Python: https://github.com/mloesch/sickle. Have any users switched from one to the other? Any experiences they'd like to share? This project too has some concerns regarding maintenance status.

Would really appreciate a bit of clarification here, as it doesn't seem totally abandoned, but for example, there's a PR that's ready to go, but been waiting a year: #39 ...

The text was updated successfully, but these errors were encountered:

wetneb · 2020-04-24T17:20:26Z

I have kindly been given write access to the repository, so I could help with merging PRs, but I don't think I can upload releases to PyPI.
Personally I am not using this project anymore but happy to help if needed.
sickle does look interesting, and actively maintained - so if you are not relying on pyoai yet, I think going for sickle would be wise.

jascoul · 2020-04-24T17:22:37Z

Hi Daniel,

I can see where you're coming from, there is not a lot going on in the codebase.
This is mostly because the OAIPMH standard has not changed in many years, and this implementation is fairly complete. I would say the python3 support is the last major addition.

You are right that there are some long standing merge requests. It's hard for me to find time to commit to the project. I've added a maintainer last year in the hope that this would improve things.

Sickle looks also interesting, I think the main difference is that pyoai also contains a server implementation.

That being said, the code is stable and used in production for many years by many projects.
Hope this clarifies things a bit.

interrogator · 2020-04-24T18:08:03Z

Wow, two quick responses, much appreciated @wetneb @jascoul

Yes, the use of pyoai in numerous other projects is what initially swayed me, and I'm aware that OAI-PMH protocol itself also hasn't seen major changes in years. But the repo as it stands has a couple of red flags, no offense of course, it's a fantastic project and an asset for the community, no doubt. I want to hear what sickle's thoughts are too and then can make a choice.

Might be good if you guys made some kind of notice clarifying maintenance status in the README. If it's simply feature-complete, highlighting that with badges for test coverage (since you already have CI), listing projects that use pyoai etc., would probably be the things that would alleviate my concerns as a dev searching for the best dependencies possible. Linking to sickle (or any other similar project) would also be a classy move I suppose.

The main issue I see now, is that OAI-PMH protocol may see an update, and there's nobody here with the time to make the needed revisions. But I doubt they'd do something backwards incompatible, as it'd pose a problem for the entire ecosystem...

If I do use pyoai, and discover bugs, it would at least be nice if I could make PR and get it merged; a shame to see that a few people are maintaining forks with just a couple of lines of code changed. Please consider at least getting the PRs in, adding requirements.txt (with your two dependencies :P ), and doing a version bump. Since you're still around, even a patch release for 2020 would be great!

interrogator · 2020-04-24T18:15:12Z

Oh, and you would probably still pass tests if you add Python 3.7 and Python 3.8 to travis --- would be great if you were officially compatible with latest Pythons, as I'm using 3.7 for current project.

interrogator · 2020-04-24T18:34:06Z

Looked through issues, most are closeable, too. Made some comments, I think it'd really be worth the time to comment and close on the stale/fixed/malformed issues.

wetneb · 2020-04-24T19:58:19Z

Do you have a reason not to use sickle? If you have the energy to clear the backlog of pyoai issues that's great, but because pyoai is in use in many projects already you will not be able to change the API much, and some aspects of it are quite questionable.

For instance, if you use ListRecords and the query happens to return no results, it will raise an exception rather than return an empty generator. So I bet the first thing you will probably do is write a wrapper on your side to fix that. That's just one issue on top of my mind, not having used it for years.

(That being said, the project I used pyoai in still relies on it, so it is pretty much in my interest to convince you to fix things in it! I just want to be honest with you…)

mpasternak · 2020-04-24T22:36:07Z

Count me & my branch in https://github.com/mpasternak/pyoai?organization=mpasternak&organization=mpasternak

interrogator · 2020-04-25T13:51:36Z

I don't want to change the API at all, can work around known issues so long as they have workarounds. Many projects seem to rely on pyoai, so it should stay as stable as possible. But yeah it seems like the general health of the project can be improved just with a bit of reviewing, merging, documenting etc., I'm happy to participate in this if it means I have a dependency I can trust.

Right now I don't even know if my PRs would be accepted if I made them (i.e. adding py3.7 to travis and other little things like that). If @wetneb @jascoul confirm they will actually review and merge such things, I'd put in a little bit of work to get the project healthy again (as I tried to do by assessing state of the issues backlog).

Indeed I could end up using sickle, but as far as I can see, pyoai is still the 'canonical' code for the job, so I am happy to do both, make sure this project is healthy and then compare with sickle before deciding which suits my needs. Really, for this repo, it seems like very little work to bring it up to a nice stable state, so why not? An hour from each dev would probably get things up to where they should be --- there are no huge plans for major expansions or refactors, and existing users could stay version-locked if there ended up being a major/minor version bump (which I hope there will be!). But nobody has suggested any breaking changes, just small fixes to the code, 3.7 tests, maybe some more stuff in README...

So yeah, let me know if there's anything else in particular I can do to revive this project a little, aside from the issue review I just did. I'd like to Make PYOAI Great Again :D

krenzlin · 2020-05-08T11:02:05Z

Just to add my use case. I use the server implementation, so sickle is no option.
Only had to monkey patch cgi.parse_qs as it was removed in Python 3.7/3.8 (??)

interrogator · 2020-05-09T13:31:03Z

@krenzlin Is there an issue about that one? I'm on 3.7 and didn't realise that was an issue.

Can you describe fix in more detail if it's not online somewhere?

krenzlin · 2020-05-11T14:04:30Z

@interrogator
Actually it's with 3.8, where parse_qs was removed from cgi (cf. issue 33843).
Add 38 to tox and AttributeError: module 'cgi' has no attribute 'parse_qs' appears.

The easy fix (for 3.8) is just to replace the import.

diff --git a/src/oaipmh/server.py b/src/oaipmh/server.py
index 593a4ee..2de2704 100644
--- a/src/oaipmh/server.py
+++ b/src/oaipmh/server.py
@@ -2,10 +2,10 @@ from lxml.etree import ElementTree, Element, SubElement
 from lxml import etree
 from datetime import datetime
 try:
-    from urllib.parse import urlencode, quote, unquote
+    from urllib.parse import urlencode, quote, unquote, parse_qs
 except ImportError:
     from urllib import quote, unquote, urlencode
-import sys, cgi
+import sys
 
 from oaipmh import common, metadata, validation, error
 from oaipmh.datestamp import datestamp_to_datetime, datetime_to_datestamp, DatestampError
@@ -454,7 +454,7 @@ def decodeResumptionToken(token):
     token = str(unquote(token))
     
     try:
-        kw = cgi.parse_qs(token, True, True)
+        kw = parse_qs(token, True, True)
     except ValueError:
         raise error.BadResumptionTokenError(
               "Unable to decode resumption token: %s" % token)
diff --git a/tox.ini b/tox.ini
index c4f6d25..18be97c 100644
--- a/tox.ini
+++ b/tox.ini
@@ -1,5 +1,5 @@
 [tox]
-envlist = py{27,33,34,35,36}
+envlist = py{38}
 
 [testenv]
 changedir = src/oaipmh/tests

krenzlin · 2020-05-11T14:06:16Z

To support the older version we will need a version guard. At least in 2.7 the "new" parse_qs lives in urlparse renamed to urllib.parse in version 3.

interrogator · 2020-05-12T09:40:31Z

@wetneb will you merge the above if a fix is made? if so @krenzlin do you want to submit the patch?

I see no issue with just doing some:

try:
    import x
except ImportError:
    import y as x

I could also make the PR but it seems you've already done the fix, so...

wetneb · 2020-05-12T12:45:41Z

We are still stuck by the lack of Travis. Actually, I am thinking I could potentially try enabling GitHub actions instead: I don't think I need admin permissions to the repo, just write access should be enough.

I'll try to have a go at it today: #44

krenzlin · 2020-05-12T13:18:20Z

@interrogator Happy to contribute, but it will have to wait till tomorrow.

wetneb · 2020-05-12T14:38:39Z

@krenzlin looking forward to your PR: you should be able to add Python 3.8 to the test grid by editing .github/workflows/run_tests.yml and tox.ini.

interrogator · 2020-05-12T15:30:37Z

Thanks guys, great stuff. Let me know if there's something I can do to help.

krenzlin · 2020-05-14T10:15:20Z

@wetneb PR #46 was as easy as described by you.

wetneb · 2022-03-09T09:12:08Z

Let's declare that this library is maintained again, since we have a fresh release on PyPI :)

interrogator mentioned this issue Apr 24, 2020

Project stability and versioning? mloesch/sickle#40

Open

wetneb closed this as completed Mar 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pyoai health / maintenance status? #43

pyoai health / maintenance status? #43

interrogator commented Apr 24, 2020

wetneb commented Apr 24, 2020

jascoul commented Apr 24, 2020

interrogator commented Apr 24, 2020

interrogator commented Apr 24, 2020

interrogator commented Apr 24, 2020

wetneb commented Apr 24, 2020

mpasternak commented Apr 24, 2020

interrogator commented Apr 25, 2020

krenzlin commented May 8, 2020

interrogator commented May 9, 2020

krenzlin commented May 11, 2020

krenzlin commented May 11, 2020

interrogator commented May 12, 2020

wetneb commented May 12, 2020

krenzlin commented May 12, 2020

wetneb commented May 12, 2020

interrogator commented May 12, 2020

krenzlin commented May 14, 2020

wetneb commented Mar 9, 2022

pyoai health / maintenance status? #43

pyoai health / maintenance status? #43

Comments

interrogator commented Apr 24, 2020

wetneb commented Apr 24, 2020

jascoul commented Apr 24, 2020

interrogator commented Apr 24, 2020

interrogator commented Apr 24, 2020

interrogator commented Apr 24, 2020

wetneb commented Apr 24, 2020

mpasternak commented Apr 24, 2020

interrogator commented Apr 25, 2020

krenzlin commented May 8, 2020

interrogator commented May 9, 2020

krenzlin commented May 11, 2020

krenzlin commented May 11, 2020

interrogator commented May 12, 2020

wetneb commented May 12, 2020

krenzlin commented May 12, 2020

wetneb commented May 12, 2020

interrogator commented May 12, 2020

krenzlin commented May 14, 2020

wetneb commented Mar 9, 2022