Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add generated_by attribute to File class #3344

Closed
wants to merge 4 commits into from
Closed

Conversation

timvink
Copy link
Contributor

@timvink timvink commented Aug 22, 2023

Plugins like mkdocs-gen-files and the mkdocs-material blog plugin create new files that are added to the site dynamically.

By adding a generated_by attribute to the File class that plugin developers can use, other plugins can improve interopt.

This PR is meant to open the discussion, feel free to close the PR if this is not the right direction / not useful.

Plugins like `mkdocs-gen-files` and the `mkdocs-material` blog plugin create new files that are added to the site dynamically.

By adding a `generated_by` attribute to the `File`  class, other plugins can improve interopt.
Add `generated_by` attribute to `File` class
@oprypin
Copy link
Contributor

oprypin commented Aug 22, 2023

I'm not sure where I stand on this decision, but I think for this to be more useful, normally it shouldn't be shown as "generated" by mkdocs, because these are based on real files. Or regardless of that, the default should just be None.

@squidfunk
Copy link
Sponsor Contributor

squidfunk commented Aug 22, 2023

I would love to see this in MkDocs, as it would provide a unified way of signaling that a file was generated. I second that the default should be set to None for files that are collected and managed by MkDocs. The main point why I'd like to see it is that typings would show the field when accessing File. Another idea might be to assign a plugin instance instead of a file for further processing and interaction by other plugins, but that might not work with hooks.

@ultrabug
Copy link
Member

As a plugin developer while I find the idea seducing I fail at getting a clear use case but that's maybe my own bias. I don't want to sound like putting down the idea, I just need help in understanding what missing feature that would bring to us plugin devs.

other plugins can improve interopt.

Can you explain your use case maybe @timvink?

typings would show the field when accessing File

That sounds like a convenience but not a concrete feature to me, why would that be that helpful to you?

Another idea might be to assign a plugin instance instead of a file

I would assign the plugin name since we can then get access to it via the PluginCollection class.

As for the default to None, I agree with you.

@squidfunk
Copy link
Sponsor Contributor

Can you explain your use case maybe @timvink?

The blog plugin must make use of generated_by to notify the git-revision-date-localized plugin that a file is generated, like category and archive index pages, the latter of which will not try to resolve information from git for it.

That sounds like a convenience but not a concrete feature to me, why would that be that helpful to you?

It would promote a standardized way for plugins to distinguish generated files.

I would assign the plugin name since we can then get access to it via the PluginCollection class.

How about multi instance capable plugins? For Material for MkDocs, those include: blog, optimize, privacy, social, tags.

@oprypin
Copy link
Contributor

oprypin commented Aug 23, 2023

The blog plugin must make use of generated_by to notify the git-revision-date-localized plugin that a file is generated, like category and archive index pages,

What that plugin actually wants to determine is whether a document is backed by a real file. Which I think is probably already possible to do in another way - by checking for discrepancies in paths or something like that - and this way can be made official and easier to use.
It's not that we "must" notify it, it's just that we didn't dig for a correct solution and I just accepted a workaround in one plugin if that helps make the other plugin work better, because I took way too long to respond there.

@ultrabug
Copy link
Member

@squidfunk thanks, I understand better how this can be a real feature.

I have the feeling that if mkdocs itself was responsible for populating this field automatically it would be even better (that's how I imagined it in the first place). Imagine if this generated_by was present for all plugins, we also would benefiting from cool debugging capabilities.

How about multi instance capable plugins? For Material for MkDocs, those include: blog, optimize, privacy, social, tags.

Since my initial idea was that mkdocs itself would populate this field and knowing that newer mkdocs do prefix plugins with their name like material/blog it would have worked easily. That's what you're already doing with the blog plugin anyway so maybe my English was not precise enough. All good here.

@squidfunk
Copy link
Sponsor Contributor

squidfunk commented Aug 23, 2023

I have the feeling that if mkdocs itself was responsible for populating this field automatically it would be even better (that's how I imagined it in the first place). Imagine if this generated_by was present for all plugins, we also would benefiting from cool debugging capabilities.

From my experience writing plugins, I'm not sure how that should work reliably.

Since my initial idea was that mkdocs itself would populate this field and knowing that newer mkdocs do prefix plugins with their name like material/blog it would have worked easily. That's what you're already doing with the blog plugin anyway so maybe my English was not precise enough. All good here.

The current solution is far from ideal, as it does not support multi-instance as far as I know. There's no way to know which instance generated the file if you only have the name of the plugin, AFAIK.

@ultrabug
Copy link
Member

From my experience writing plugins, I'm not sure how that should work reliably.

You must be right but in case it can help thinking about it I allowed myself a little POC in #3345 : food for thoughts (or not).

@squidfunk
Copy link
Sponsor Contributor

@ultrabug: another more verbose idea might be to provide a subclass of file that is called GeneratedFile (or something similar), which somehow infers or allows to set the name of the plugin. It would be more explicit.

@oprypin
Copy link
Contributor

oprypin commented Aug 23, 2023

To expand on my comment:

What that plugin actually wants to determine is whether a document is backed by a real file. Which I think is probably already possible to do in another way

https://github.com/timvink/mkdocs-git-revision-date-localized-plugin/blob/6150fad41a83abbf7500cf9c75696c5193532305/mkdocs_git_revision_date_localized_plugin/plugin.py#L202

I mean that the plugin can just replace this code and then the attribute doesn't need to be set:

if getattr(file, "generated_by"):  # relies on plugins setting the value
if Path(file.abs_src_path).is_relative_to(config.docs_dir):

Although maybe this is not fully reliable with multi-instance plugins such as "monorepo" "multirepo" "static-i18n"

@pawamoy
Copy link
Sponsor Contributor

pawamoy commented Aug 23, 2023

@timvink indeed mentions some edge cases that would probably not be well supported by this approach of checking the file absolute path against the config file or docs dir path.

@oprypin
Copy link
Contributor

oprypin commented Aug 23, 2023

Great input! I didn't see that

@timvink
Copy link
Contributor Author

timvink commented Aug 26, 2023

Great discussion!

Indeed the core issue for me is determining whether a File is backed by a real file. Trying to determine that relative to a docs or config directory got me into tricky edge cases trying to have compatibility with for example monorepo.

I really like the proposal by @ultrabug of having mkdocs set this attribute automatically. Much less work for plugin maintainers and much more reliable (because it will also work for new plugins).

The generated_by attribute might unlock new use cases for plugin developers also that we don't see yet & potentially improve debugging also.

@timvink timvink marked this pull request as draft August 26, 2023 19:24
@phil65
Copy link
Contributor

phil65 commented Sep 5, 2023

Instead of a generated_by attribute, perhaps it´s more future-proof to just add a generic File.metadata dictionary where plugin authors can store any additional info.

@squidfunk
Copy link
Sponsor Contributor

Instead of a generated_by attribute, perhaps it´s more future-proof to just add a generic File.metadata dictionary where plugin authors can store any additional info.

That'd defeat the purpose of providing a unified way of saying "this file is generated". However, I agree that, in general, adding meta to File might be useful.

@phil65
Copy link
Contributor

phil65 commented Sep 5, 2023

Instead of a generated_by attribute, perhaps it´s more future-proof to just add a generic File.metadata dictionary where plugin authors can store any additional info.

That'd defeat the purpose of providing a unified way of saying "this file is generated". However, I agree that, in general, adding meta to File might be useful.

That´s true.
I don´t know about your specific use cases, but wouldnt it make more sense to attach metadata to StructureItems and not to Files? (Page already carries metadata-ish info in form of a build date).

@ultrabug
Copy link
Member

ultrabug commented Sep 6, 2023

That'd defeat the purpose of providing a unified way of saying "this file is generated". However, I agree that, in general, adding meta to File might be useful.

I would refrain in using the meta (and even metadata) in the context of Markdown related stuff.

Page.meta is about MarkDown files metadata header; I can only foresee confusion if we start using this word in other places in the code just to be "future proof". I'd stick with simple solutions to real use cases imho.

@phil65
Copy link
Contributor

phil65 commented Sep 6, 2023

Of course naming could be discussed, metadata is perhaps not the best fit.
As an example, pandas calls something similar "attrs" ( https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.attrs.html )

@squidfunk
Copy link
Sponsor Contributor

Coming back here after 2 months to check if there's any progress. With MkDocs being adopted by more users and more plugins appearing that generate files, we need a standard way to signal that a file is generated. What are the current plans by the maintainers? I'm asking, since this issue/PR is not scheduled for the 1.6.0 milestone.

@oprypin
Copy link
Contributor

oprypin commented Oct 21, 2023

Let's approve this generated_by attribute. The code change itself (defining the attribute) is not urgent because plugins should keep using if getattr(file, 'generated_by', None): as the check for the foreseeable future anyway. But yes we'll get the change in.

Now I am working on a big change to improve the approach to generating files - obvious for plugins and easy enough even for hooks; no more messing with temporary directories. The addition of the generated_by attribute will just be a small piece that fits into that design.

@oprypin
Copy link
Contributor

oprypin commented Oct 21, 2023

The current solution is far from ideal, as it does not support multi-instance as far as I know. There's no way to know which instance generated the file if you only have the name of the plugin, AFAIK.

Oh I didn't expect this actually. Do you think it would make sense to sometimes get ids such as generated_by='material/blog #2'? I was going to strip them instead

@squidfunk
Copy link
Sponsor Contributor

squidfunk commented Oct 22, 2023

Oh I didn't expect this actually. Do you think it would make sense to sometimes get ids such as generated_by='material/blog #2'? I was going to strip them instead

I'm not sure what the best value is that should go into generated_by. I discussed some ideas in my comments in this issue, one of them being the plugin instance itself. That way another plugin might be able to query the state of the plugin. However, if the fully qualified name could be used to retrieve the plugin instance, that could also be sufficient.

@oprypin
Copy link
Contributor

oprypin commented Dec 16, 2023

I have completed work on my pull request which also adds a convenient way to create a generated file for a Files collection.

Most notable commits:

Description:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants