Re-factor Sitemap plugin #3

Merged: 21 commits, Jul 24, 2023

Conversation

kernc
Member

@kernc kernc commented May 2, 2020

This is an attempt at PoC simplification (ymmv) of the plugin. Comments and ideas welcome!

  • Act on every content_written signal to avoid guessing which pages to cover.
  • Make it work correctly with the i18n_subsites plugin (before, it created a separate sitemap file for each subsite). OK, this actually doesn't work. No idea what I was testing before. 😢
  • Don't manually fiddle with timezones. Instead, expect article.date to be TZ-aware if required.
  • settings['SITEURL'] is modified within the content_written signal. With RELATIVE_URLS=True, the resulting advertised file locations are likely incorrect.
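
As an illustration of the timezone bullet, here is a minimal sketch of serializing a date for a sitemap `<lastmod>` value without guessing an offset. The helper name `format_lastmod` is hypothetical and is not taken from the plugin. It only assumes that `article.date` is a standard `datetime` object.

```python
from datetime import datetime, timedelta, timezone


def format_lastmod(date):
    """Serialize a datetime for <lastmod> without inventing a timezone.

    Hypothetical helper: a TZ-aware datetime keeps its own offset, while a
    naive one is reduced to a plain date so that no offset is implied.
    """
    if date.tzinfo is not None and date.utcoffset() is not None:
        return date.isoformat(timespec="seconds")
    return date.date().isoformat()


aware = datetime(2020, 5, 2, 19, 52, tzinfo=timezone(timedelta(hours=2)))
naive = datetime(2020, 5, 2, 19, 52)

print(format_lastmod(aware))  # 2020-05-02T19:52:00+02:00
print(format_lastmod(naive))  # 2020-05-02
```

Both outputs are valid W3C Datetime forms, which is the format the sitemap protocol expects for `<lastmod>`.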

kernc added 5 commits May 2, 2020 19:52
Contributor

@avaris avaris left a comment

I left some comments. Overall, looks cleaner but I don't really like how files are used to "store state" between calls. I'd make this a class, keep state that way and write everything at the end.

@kernc
Member Author

kernc commented May 3, 2020

> I'd make this a class, keep state that way and write everything at the end.

Thanks for the review. That makes perfect sense. How would I go about passing the same instance through all the signals without making it a global variable?

@avaris
Contributor

avaris commented May 3, 2020

Having a module global instance and registering methods of it to signals is fine in this case. Certainly, a lot more preferable to keeping state in files :).
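
A rough sketch of the approach suggested here, under stated assumptions: the class name `SitemapCollector`, its methods, and the use of the `output_file` context key are illustrative and not the plugin's actual code. Pelican is imported only inside `register()` so the rest of the sketch runs without it.

```python
import os
from xml.sax.saxutils import escape


class SitemapCollector:
    """Hypothetical collector that keeps state in memory instead of files."""

    def __init__(self):
        self.urls = []

    def on_content_written(self, path, context):
        # Pelican sends `content_written` with the written path as sender and
        # the template context as the `context` keyword argument.
        # Assumption: the context carries SITEURL and `output_file`. With
        # RELATIVE_URLS=True the context's SITEURL is rewritten, so a real
        # implementation would likely take the site URL from elsewhere.
        siteurl = context.get("SITEURL", "").rstrip("/")
        self.urls.append(f"{siteurl}/{context['output_file']}")

    def render(self):
        entries = "\n".join(
            f"  <url><loc>{escape(url)}</loc></url>" for url in self.urls
        )
        return (
            '<?xml version="1.0" encoding="utf-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{entries}\n"
            "</urlset>\n"
        )

    def on_finalized(self, pelican):
        # Write everything once, at the end of the run.
        target = os.path.join(pelican.output_path, "sitemap.xml")
        with open(target, "w", encoding="utf-8") as f:
            f.write(self.render())


# Module-global instance; its bound methods are the signal receivers.
collector = SitemapCollector()


def register():
    from pelican import signals  # lazy import: only needed inside Pelican

    signals.content_written.connect(collector.on_content_written)
    signals.finalized.connect(collector.on_finalized)
```

Because Pelican's signals are blinker signals, which hold receivers weakly by default, keeping the instance alive at module level is what keeps the bound-method receivers connected for the whole run.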

trans.lang,
siteurl,
# save_as path is already output-relative
clean_url(pathname2url(trans.save_as)),
Member Author

This now works with i18n_subsites 🎉, but the URL produced here for translation links is incorrect. trans.save_as is formatted according to e.g. PAGE_LANG_URL for all subsites (including the DEFAULT_LANG subsite, where PAGE_URL is the appropriate one).

Ideas on how else to get the proper final URL/path on disk?

Member Author

Can someone else confirm this? i18n_subsites no longer builds with my project. 😬 😂

@kernc
Member Author

kernc commented Jun 5, 2020

@justinmayer if you can provide some rough test scaffold the way you want to carry it, I'd be happy to add remaining tests.

@kernc kernc requested a review from avaris June 5, 2020 00:30
@justinmayer
Contributor

> @justinmayer: If you can provide some rough test scaffold the way you want to carry it, I'd be happy to add remaining tests.

I am SO sorry to have taken so long to respond to this. I'm happy to update the CI environment to run pytest, in much the same way as it's done in the relevant section of the Pelican Plugin CookieCutter Template.

Beyond that, I would suggest taking a look at the way other plugins have implemented their tests and then use those as a guide to add tests for this Sitemap plugin.

Does that give you what you need to move this forward so we can get your excellent enhancements merged? 😅

@justinmayer
Contributor

@kernc: I have added the necessary test machinery in the CI environment via #20.

@justinmayer
Contributor

Hey @kernc! The new test machinery is in place in the CI environment. Any chance you could add some tests and rebase+resolve your enhancements on the current main branch?

@justinmayer
Contributor

Hi @kernc! I added some very basic initial tests just to get the CI system to stop failing due to non-existent tests. In addition, #16 has been merged, and Sitemap 1.0.3 has been released, so when you have the time, I look forward to building on top of that and releasing a new version soon containing your excellent refactoring of the Sitemap plugin. ✨

@kernc
Member Author

kernc commented Jul 14, 2023

I'm sorry to say I'll still be needing some help re tests that use the detestable tooling I'm unfamiliar with (poetry, pytest).
I guess we need to do an integration test with running Pelican proper. Can you point to an example test I can copy/learn from?

@justinmayer
Contributor

I'm sorry that you are finding Poetry and Pytest to be unpleasant. What is it about this tooling that you find to be detestable?

Please keep in mind that there is nothing preventing you from using unittest from the Python standard library to write the tests. In fact, most of the tests for the plugins under this Pelican Plugins organization use unittest, with Pytest used only to run the tests.
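
For illustration, a minimal `unittest`-style test of the kind described above might look like the following sketch. The sample XML and the class name are hypothetical. A real test would more likely build a small site with Pelican and read the generated `sitemap.xml`.

```python
import unittest
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical output standing in for a generated sitemap.xml.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/archives/</loc></url>
</urlset>
"""


class SitemapTest(unittest.TestCase):
    def test_lists_expected_urls(self):
        root = ET.fromstring(SAMPLE_SITEMAP)
        # Elements live in the sitemap namespace, so the lookup is prefixed.
        locs = [el.text for el in root.findall("sm:url/sm:loc", SITEMAP_NS)]
        self.assertEqual(
            locs,
            ["https://example.com/", "https://example.com/archives/"],
        )
```

Saved under a name such as `test_sitemap.py` (a hypothetical filename), this would be collected both by `python -m unittest` and by Pytest acting purely as the runner.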

If you decide to use Pytest methodology instead of unittest to write the tests, the following repositories in this Pelican Plugins organization appear to use Pytest-based tests:

  • image-preview-thumbnailer
  • image-process
  • linkbacks
  • mau-reader
  • seo
  • series
  • share-post
  • show-source

Does that help?

@kernc
Member Author

kernc commented Jul 14, 2023

> What is it about this tooling that you find to be detestable?

Namely the complexity, compatibility, opinionatedness (see this idiocy, for instance), and feature creep (see e.g. --help). You'd rightly expect that after so many years the development would have matured and stabilized, but the charts affirm otherwise. If the projects were any good, CPython would have inherited them by now. After five (or 50 or so) designated build systems (etc.), I'm still waiting for the one we settle on. And I'm now quite confident programming will be completely obsolete before then.

Never mind the vehemence. I enjoy an opportunity to rant. 😆


OT:
I found help in 'markdown-include' addon. Basic covering test added—hope it's fine? One'd sure be challenged to assemble a lighter one with pytest.

Even though the PR changeset balance is now almost at break even, I'll wager the new code is much easier to follow.

@justinmayer
Contributor

I appreciate a good rant. 😉 I originally added Poetry to the toolchain because at the time it provided (me) some very real benefits, but over time I have come to the conclusion that it tries to do too much and in a way that does not conform to standards that have coalesced since Poetry was created. Before fully throwing in the towel on some of the benefits provided by more modern tooling, I have done some experiments with Hatchling + PDM and have found that combination to be pleasant as well as more standards-compliant than Poetry. That said, I am still evaluating — the jury remains out.

The “idiocy” you refer to above actually has nothing to do with Poetry, by the way, and is instead the result of CI enforcing linter rules — in this case, Black. One could certainly make a case against Black, of course, but I at least wanted to make sure the vehemence is directed at the right target. 😊

Speaking of linters, I added a commit (018a4c9) to resolve a few things that were causing Ruff to report compliance problems in CI.

Contributor

@justinmayer justinmayer left a comment

Seems good to me. Any follow-up thoughts, @avaris?

Contributor

@avaris avaris left a comment

Looks good, thanks!

@justinmayer
Contributor

@kernc: One last question so I can ensure the change-log entry is accurate… When comparing XML files before and after this PR, I noticed the following differences:

  • /archives/ has been added to the list of indexed items
  • items appear in a different order than before

Are you aware of any other changes to the output that should be noted in the change-log?
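
One way to run such a before/after comparison while ignoring item order is sketched below. The function names and the sample XML are hypothetical, and the sketch compares only `<loc>` values, so differences in other fields such as `<lastmod>` would need a separate check.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locs(xml_text):
    """Return the set of <loc> values in a sitemap, ignoring their order."""
    root = ET.fromstring(xml_text)
    return {(el.text or "").strip() for el in root.findall("sm:url/sm:loc", NS)}


def diff_sitemaps(before_xml, after_xml):
    """Return (added, removed) URL lists between two sitemap documents."""
    before, after = sitemap_locs(before_xml), sitemap_locs(after_xml)
    return sorted(after - before), sorted(before - after)


# Hypothetical sample data mirroring the two differences noted above.
BEFORE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/a.html</loc></url>
</urlset>
"""
AFTER = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a.html</loc></url>
  <url><loc>https://example.com/archives/</loc></url>
  <url><loc>https://example.com/</loc></url>
</urlset>
"""

added, removed = diff_sitemaps(BEFORE, AFTER)
print(added)    # ['https://example.com/archives/']
print(removed)  # []
```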

@justinmayer justinmayer changed the title from "Refactor sitemap" to "Re-factor Sitemap plugin" Jul 24, 2023
@justinmayer
Contributor

Many thanks to @kernc for the significant enhancements and for all the patience along the road to getting this merged. Thanks also to @avaris for reviewing! ✨

@justinmayer justinmayer merged commit a78416c into pelican-plugins:main Jul 24, 2023
7 checks passed