Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API] Add marketplace apis #1080

Merged
merged 26 commits into from Aug 2, 2021
Merged

Conversation

theSaarco
Copy link
Member

@theSaarco theSaarco commented Jul 1, 2021

Add MLRun APIs to handle marketplace functionality. This is an initial PR which only contains the APIs and the logic behind them. It still is missing the following main components:

  1. MLRun logic to utilize this mechanism when using functions from the marketplace (such as hub://describe).
  2. Some changes in marketplace structure and format are still expected. These will evolve as the marketplace generation facility matures.
  3. UI utilization of these APIs (both as a static HTML and from the MLRun UI)

In this PR, the following APIs are added:

Marketplace sources CRUD APIs

POST /marketplace/sources
Add a new marketplace source. The payload is of type schames.IndexedMarketplaceSource. For example:

{
        "index": -1,      # -1 for last, else must be >0.
        "source": {
            "kind": "MarketplaceSource",
            "metadata": {"name": "MLRun function source", "description": "A test"},
            "spec": {"path": "https://...", "channel": "development", "credentials": {"access_key": "..."}},
            "status": {"state": "created"},
        },
}

PUT /marketplace/sources/{source_name}
Create or modify a source (or its order in the list). Same payload as the one used for POST.

DELETE /marketplace/sources/{source_name}
Delete a source.

GET /marketplace/sources
GET /marketplace/sources/{source_name}
List (1st API) and get (2nd api) for sources. Return a list or a single schames.IndexedMarketplaceSource objects.

Default marketplace source

By default, the DB is populated with the MLRun global marketplace source, and its location is always -1, which means it will always be last. It cannot be modified or changed (including its order in the list of sources). For example, if a source is posted with order -1, it will actually be inserted to the end of the list (1 if it's the 1st item etc.).
The default source is returned in list/query APIs and is generally a read-only item.
The details of the default source are all in the config file, under marketplace.default_source. To avoid automatically creating the source (for example if it's a dark site and there's no use in it), set marketplace.default_source.create=False.

Marketplace catalog and item APIs

GET /marketplace/sources/{source_name}/items
Retrieve the marketplace items catalog for this specific source. Returns a schemas.MarketplaceCatalog object. Supports filtering on various parameters of the items, using query parameters. Query params available:

  • channel - filter items by specific channel
  • version - filter items by a version
  • tag - filter items by tag
  • force-refresh - force the server to retrieve the catalog again from the source. Otherwise the result will be based on the cached information if previously retrieved. Use this in case changes were done to the source structure (adding functions, re-building functions etc.)

GET /marketplace/sources/{source_name}/items/{item_name}
Get a single item from the catalog. Query params are the same as in list API, and allow to filter the items based on their metadata. However, in this case they have default values which usually will return a single default item even when not specifying any value. In general all params are required to generate a query for a single item:

  • channel - default is development
  • version - default is None
  • tag - default is latest
  • force-refresh - same as in the catalog API, will force the server to retrieve the catalog from the source.

GET /marketplace/sources/{source_name}/item-object
Retrieve the file contents in the specified URL. If the source is registered with credentials, they will be used to access the objet.
Mandatory query params:

  • url - URL of the item to retrieve from the marketplace.

Also, httpdb API were added (and documentation was provided) to cover the marketplace functionality.

@theSaarco theSaarco marked this pull request as draft July 1, 2021 14:57
@theSaarco theSaarco closed this Jul 1, 2021
@theSaarco theSaarco reopened this Jul 1, 2021
@theSaarco theSaarco marked this pull request as ready for review July 12, 2021 05:54
Copy link
Contributor

@Hedingber Hedingber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some things that are not clear to me:

  1. Why the global marketplace is always last ?
  2. I'm not sure I understand what "channel" tries to be abstraction for, in github context it's simply branch, in other context it's what ?
  3. When first reading the PR description it wasn't clear to me when you say you allow to filter by version, what is the version of, please note to make it clear it's the version of the items (especially in the httpdb docstrings, PR description as well)
  4. I don't fully understand what between tag and version

mlrun/api/api/endpoints/marketplace.py Outdated Show resolved Hide resolved
source: OrderedMarketplaceSource,
db_session: Session = Depends(mlrun.api.api.deps.get_db_session),
):
get_db().create_marketplace_source(db_session, source)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting once on this - you'll need to create a CRUD layer that will enforce OPA permissions

mlrun/api/api/endpoints/marketplace.py Show resolved Hide resolved
mlrun/api/crud/function_marketplace.py Outdated Show resolved Hide resolved
mlrun/api/crud/function_marketplace.py Outdated Show resolved Hide resolved
mlrun/api/crud/function_marketplace.py Outdated Show resolved Hide resolved
)
return order

def create_marketplace_source(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add logs to create store and delete

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what logs are missing? Log every time these methods are called? This is not done in other places in SQLDB, so why here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we don't have enough logs in code, but you can see new (or actually ones that was recently refactored) APIs such as the projects ones, do have logs (in the DB layer), let's add them

mlrun/api/initial_data.py Show resolved Hide resolved
mlrun/api/schemas/marketplace.py Outdated Show resolved Hide resolved
tests/rundb/test_httpdb.py Outdated Show resolved Hide resolved
Copy link
Contributor

@Hedingber Hedingber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good to me, just:

  1. Add logs in the DB layer
  2. Add OPA enforcement (require some offline discussion)
  3. explain me the questions I had in the previous CR (let's do it offline as well)

@theSaarco
Copy link
Member Author

Some things that are not clear to me:

  1. Why the global marketplace is always last ?
  2. I'm not sure I understand what "channel" tries to be abstraction for, in github context it's simply branch, in other context it's what ?
  3. When first reading the PR description it wasn't clear to me when you say you allow to filter by version, what is the version of, please note to make it clear it's the version of the items (especially in the httpdb docstrings, PR description as well)
  4. I don't fully understand what between tag and version

@Hedingber, some answers to your questions:

  1. It's always last since the logic is that any marketplace source added always overrides the default global marketplace (since it's a local marketplace). Therefore the logic of selecting a function is expected to first retrieve the function from a user-provided marketplace and fall back to the global marketplace.
  2. Channel in GH will indeed usually map to a branch. I have no other examples at the moment to channels, but I try to figure out the purpose of this question - are you asking why we don't call it branch? Just to note, these terms are determined from the marketplace creation tool, and I'm just reflecting them here.
  3. The PR description is fixed. I'll modify the httpdb docstring as well.
  4. Again, this is how the catalog is structured and I'm reflecting it. In general the tag is user-generated and version is sort of an automatic version (or close to) such as 1.0.0.

Copy link
Contributor

@Hedingber Hedingber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So

  1. Adding OPA requires some further design and support from other components
  2. Things like what's between version and tag are not "sourced" from this code, this code just reflect the way it's designed in the new marketplace
  3. This PR holds a lot of code generally, and specifically some code in the k8s util + secrets area that need need to be used by other components

So merging it for now, we will continue discussions regarding 1 and 2 offline and handle in follow up PRs if needed

@Hedingber Hedingber changed the title [API] Marketplace apis [API] Add marketplace apis Aug 2, 2021
@Hedingber Hedingber merged commit f03ad13 into mlrun:development Aug 2, 2021
ordered_source = get_db().get_marketplace_source(db_session, source_name)
return Marketplace().get_source_catalog(
ordered_source.source, channel, version, tag, force_refresh
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like docker registry V2 HTTP, I think catalog should be a distinct endpoint since catalog != list of items

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The REST API itself is GET /marketplace/sources/{source_name}/items - we don't use catalog in the REST endpoint at all. This is the internal api name (which may make sense to rename to get_source_items instead).

from mlrun.datastore import store_manager

# Using a complex separator, as it's less likely someone will use it in a real secret name
secret_name_separator = "-__-"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_ is not a valid char in secret names (or any resource names) in k8s.
Do we use the same name-label validations here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not the secret name that is affected - it's the key within the secret. There is no restriction on using _ in that aspect. The secret name itself is handled elsewhere (in the project-secret mechanism, we just assign an internal "project name" for the marketplace sources)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants