Modernize ludios_wpull #25

Merged
merged 47 commits into from
Jan 12, 2024


HeliosLHC
Collaborator

@HeliosLHC HeliosLHC commented Jan 7, 2024

This is a continuation of the modernization efforts for ludios_wpull to support newer Python versions.

Python Version Change
Python 3.12 was selected because of a potential regression identified in Python 3.11.X.

Wpull Version Change
This raises the version from 3.0.9 to 5.0.0a. The 4.X.X series is skipped on the recommendation of @JustAnotherArchivist, because my other internal fork of ludios_wpull already uses it. The "a" denotes an alpha release.

Changes

  • Bump version from 3.0.9 to 5.0.0a
  • Replace deprecated setup.py with pyproject.toml
  • Refactor deprecated async code to use modern asyncio (async/await) syntax
  • Dependencies
    • Upgrade Tornado to version 6 and refactor code to support it
    • Upgrade SQLAlchemy to version 2 and refactor code to support it
    • Upgrade yapsy (master branch) to latest version to support Python 3.10+
      • Latest release version is very old and doesn't support newer Python versions
    • Add packaging dependency used by yapsy (does not recursively resolve)
  • Change name written to WARC metadata from "Wpull" to "ludios_wpull" per @JustAnotherArchivist 's recommendation
  • Unit Tests
    • Fix the majority of the existing broken unit tests
    • Refactor unit tests to use modern asyncio, IsolatedAsyncioTestCase, and Tornado's gen_test where possible
    • Replace deprecated unit test assertion methods
  • Tornado
    • Remove deprecated Tornado code
    • Replace custom ConcurrentHTTPServer with now native ThreadingHTTPServer
  • Remove logic for handling older deprecated Python versions (below 3.7)
  • Replace namedlist with dataclasses where possible
  • Replace deprecated subclassing of object and certain collection types
  • Refactor certificate and SSL related code to support changed behavior in Python's stdlib ssl module
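
To illustrate the async and unit test items in the list above, here is a minimal sketch of what the migration pattern looks like. It is not code from this PR: `fetch_length` and `FetchLengthTest` are hypothetical names made up for the example. The old style relied on `@asyncio.coroutine` with `yield from` (removed from the stdlib in Python 3.11) and on custom event loop handling in tests; the new style uses `async def`/`await` and `unittest.IsolatedAsyncioTestCase`.

```python
import asyncio
import unittest


async def fetch_length(payload: bytes, delay: float = 0.0) -> int:
    """Hypothetical coroutine standing in for a network call (not a wpull API)."""
    # Old style would have been: `yield from asyncio.sleep(delay)` inside a
    # function decorated with @asyncio.coroutine.
    await asyncio.sleep(delay)
    return len(payload)


class FetchLengthTest(unittest.IsolatedAsyncioTestCase):
    """Each test method is a coroutine; the test case manages the event loop."""

    async def test_single_call(self):
        # assertEqual replaces the deprecated assertEquals alias.
        self.assertEqual(await fetch_length(b"warc"), 4)

    async def test_concurrent_calls(self):
        # gather() returns results in the order the awaitables were passed.
        results = await asyncio.gather(fetch_length(b"a"), fetch_length(b"bcd"))
        self.assertEqual(results, [1, 3])
```

Saved as a module, the tests can be run with `python -m unittest`; no manual event loop setup or teardown is needed.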

Remaining Broken Unit Tests
These are unit tests that remain broken from the existing ludios_wpull version or are newly broken by this change.

  1. One unit test that was broken on Python 3.8 is still broken on Python 3.12: Fix/remove wpull.proxy.proxy_test.TestProxySSL #8

  2. There's a broken unit test related to how Python 3.11 or Python 3.12 changed socket re-use behavior. This impacts unit test: wpull.network.connection_test.TestConnection.test_sock_reuse

Todo

  • Stabilize version 5.0.0a to make it to beta/RC and final release as 5.0.0 or higher.
  • Upgrade grab-site

HeliosLHC and others added 30 commits September 10, 2023 17:02
replaced namedlist and ordereddefaultdict with stdlib implementations
@HeliosLHC HeliosLHC added the enhancement (New feature or request) and dependencies (Pull requests that update a dependency file) labels Jan 7, 2024
@HeliosLHC HeliosLHC merged commit 7b4ed99 into master Jan 12, 2024
Labels
dependencies (Pull requests that update a dependency file), enhancement (New feature or request)
3 participants