Dynamic catalogs #242
Comments
This is very similar to the example workflow in intake/intake-thredds#2, and elsewhere, where a query is made when the cat is instantiated, and then catalogue entries are generated from that. Dynamic catalogues are very important to Intake!
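For anyone reading along, the "query at instantiation" pattern can be sketched without Intake at all. This is only an illustration: `DynamicCatalog`, its `query` argument and the `example.com` URLs are hypothetical stand-ins, not the intake-thredds code and not Intake's API.

```python
from typing import Callable, Dict, List


class DynamicCatalog:
    """Toy stand-in for a catalog whose entries come from a query."""

    def __init__(self, query: Callable[[], List[dict]]):
        self._query = query
        self.entries: Dict[str, dict] = {}
        self.reload()  # the query runs when the catalog is instantiated

    def reload(self) -> None:
        """Re-run the query and rebuild the entries from its results."""
        self.entries = {item["name"]: item for item in self._query()}

    def __iter__(self):
        return iter(self.entries)

    def __getitem__(self, name: str) -> dict:
        return self.entries[name]


# Hypothetical in-memory "remote service" that gains datasets over time.
remote = [{"name": "air_temp", "url": "https://example.com/air_temp.csv"}]
cat = DynamicCatalog(lambda: list(remote))

remote.append({"name": "rainfall", "url": "https://example.com/rainfall.csv"})
cat.reload()  # picks up the dataset added after instantiation
```

In a real plugin the `query` callable would presumably wrap an HTTP request to the service rather than read a list in memory.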
Ah awesome! I'll take a look at that example.
@jacobtomlinson, how are things going with catalogs and your various plugins? Are you still developing? Do you need any pointers?
Things are progressing, although we are focusing on the data sets themselves and zarr at the moment.
I am also working on a similar idea to what @jacobtomlinson describes over at intake-dcat, and have run into some similar issues. I would like to be able to read in a remote catalog, convert it to something intake-friendly, and then re-export that intake catalog for use. Some specific ideas that I think might be useful to have on the base
The user could then provide their own functions/lambdas for operating on catalogs and producing new ones. Is there any interest in these ideas? Do they already exist in the interfaces and I have just missed them?
These are all good ideas, and along the lines of what I was intending to work on in the immediate future. The iterator methods are interesting, but they basically amount to viewing the cat as an iterable (which it is) and being able to construct a new cat as Catalog(dict-like-of-entries), which you cannot, but should be able to. You can fully serialise a catalog with However, you might not want to explicitly save the catalog to YAML; maybe you want the truly dynamic version, or to use the persist() mechanism to have periodically updating snapshots of a catalog service which changes only occasionally.
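To make the "iterable in, dict-like of entries out" idea concrete, here is a rough sketch using plain dicts as stand-ins for catalog entries. Nothing below calls Intake; the entry names, drivers and URLs are made up for illustration, and the final step of passing the result to a catalog constructor is left out because, as noted above, that is not possible yet.

```python
# Plain dicts stand in for catalog entries; nothing here touches Intake.
entries = {
    "air_temp": {"driver": "csv", "args": {"urlpath": "https://example.com/air_temp.csv"}},
    "rainfall": {"driver": "csv", "args": {"urlpath": "https://example.com/rainfall.csv"}},
    "wind": {"driver": "zarr", "args": {"urlpath": "https://example.com/wind.zarr"}},
}

# Filter: keep only the entries that use the CSV driver.
csv_only = {name: entry for name, entry in entries.items() if entry["driver"] == "csv"}

# Map: derive a new set of entries with prefixed names.
prefixed = {f"met_{name}": entry for name, entry in entries.items()}
```

Both results are new mappings, and the original `entries` dict is left untouched.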
Yes, I suppose that constructing a new catalog with something like a dictionary comprehension might be more idiomatic. If we were able to do that, then my suggestions above should be doable with builtin functions. The code for
I'm currently doing some work with a data API which provides multiple CSV datasets. I ended up writing a quick notebook which gets the file manifest from the API, converts it into the intake catalog format and outputs a YAML file. We can then use the intake catalog normally.
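Roughly, the conversion step in the notebook looks like the sketch below. The manifest fields (`id`, `title`, `url`) and the `manifest_to_catalog` helper are hypothetical, since every API names these differently. The output follows the usual Intake catalog layout of a top-level `sources` mapping whose entries carry `driver`, `description` and `args`.

```python
import json


def manifest_to_catalog(manifest):
    """Build a dict shaped like an Intake YAML catalog from a file manifest."""
    sources = {}
    for item in manifest:
        sources[item["id"]] = {
            "driver": "csv",
            "description": item.get("title", ""),
            "args": {"urlpath": item["url"]},
        }
    return {"sources": sources}


# Hypothetical manifest; a real API will use its own field names.
manifest = [
    {"id": "air_temp", "title": "Air temperature", "url": "https://example.com/air_temp.csv"},
    {"id": "rainfall", "url": "https://example.com/rainfall.csv"},
]

catalog = manifest_to_catalog(manifest)

# Serialised with the standard library only; with PyYAML installed,
# yaml.safe_dump(catalog) would give the conventional YAML form instead.
catalog_text = json.dumps(catalog, indent=2)
```

The sketch uses `json` only to stay dependency-free. JSON output is accepted by most YAML parsers, but I have not checked that every loader handles it, so treat that as an assumption.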
This API is adding new datasets all the time and I would like to keep my catalog up to date. I could set up a cron job somewhere to periodically regenerate the catalog from the manifest. However, I was thinking perhaps I could go a step further and write an intake plugin which adds the sources on import.
The API I'm working with is a pretty standard one, so I could envisage writing some kind of manifest which points to the specific implementation of the API. The plugin would then get the manifest, generate the catalog and add all the datasets to the built-in catalog.
Could anyone provide me with some guidance on how I could implement this?