Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bitnami/mlflow] feat: Allow database dialects of external database to be configured #25965

Merged
merged 4 commits into from
Jun 4, 2024

Conversation

frittentheke
Copy link
Contributor

Description of the change

Currently the database URI statically uses postgresql as dialect, even though other types are supported by mlFlow.
This now allows to use the other supported database types as external database.

[1] https://mlflow.org/docs/latest/tracking/backend-stores.html?highlight=mssql#supported-store-types

Benefits

Other (mlFlow supported) database types can now be used as external databases

Possible drawbacks

Applicable issues

Additional information

Checklist

  • Chart version bumped in Chart.yaml according to semver. This is not necessary when the changes only affect README.md files.
  • Variables are documented in the values.yaml and added to the README.md using readme-generator-for-helm
  • Title of the pull request follows this pattern [bitnami/<name_of_the_chart>] Descriptive title
  • All commits signed off and in agreement of Developer Certificate of Origin (DCO)

@github-actions github-actions bot added mlflow triage Triage is needed labels May 17, 2024
@github-actions github-actions bot requested a review from carrodher May 17, 2024 10:08
@carrodher carrodher added verify Execute verification workflow for these changes in-progress labels May 18, 2024
@github-actions github-actions bot removed the triage Triage is needed label May 18, 2024
@github-actions github-actions bot removed the request for review from carrodher May 18, 2024 16:37
@github-actions github-actions bot requested a review from fmulero May 18, 2024 16:37
Copy link
Collaborator

@fmulero fmulero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @frittentheke for your contribution!

It looks great and very useful. I left a little suggestion, about the version used. Could you give it a glance?

bitnami/mlflow/Chart.yaml Outdated Show resolved Hide resolved
… be configured

Currently the database URI statically uses `postgresql` as dialect, even though
other types are supported by mlFlow.
This now allows to use the other supported database types as external database.

[1] https://mlflow.org/docs/latest/tracking/backend-stores.html?highlight=mssql#supported-store-types

Resolves: bitnami#25964
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
Signed-off-by: Bitnami Containers <bitnami-bot@vmware.com>
@frittentheke
Copy link
Contributor Author

Thanks a lot @frittentheke for your contribution!

It looks great and very useful. I left a little suggestion, about the version used. Could you give it a glance?

DONE @fmulero . PTAL.

fmulero
fmulero previously approved these changes Jun 3, 2024
Copy link
Collaborator

@fmulero fmulero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @frittentheke

LGTM

@fmulero fmulero enabled auto-merge (squash) June 3, 2024 16:33
@frittentheke
Copy link
Contributor Author

Thanks a lot @frittentheke

LGTM

Awesome @fmulero . Could you kindly check what is wrong in CI (https://github.com/bitnami/charts/actions/runs/9291472195/job/25745528664?pr=25965) keeping this from getting merged?

Signed-off-by: Carlos Rodríguez Hernández <carlosrh@vmware.com>
carrodher
carrodher previously approved these changes Jun 4, 2024
Signed-off-by: Bitnami Containers <bitnami-bot@vmware.com>
@fmulero fmulero disabled auto-merge June 4, 2024 10:35
@fmulero fmulero merged commit 6e2e370 into bitnami:main Jun 4, 2024
8 checks passed
@fmulero
Copy link
Collaborator

fmulero commented Jun 4, 2024

Thanks a lot @frittentheke
LGTM

Awesome @fmulero . Could you kindly check what is wrong in CI (https://github.com/bitnami/charts/actions/runs/9291472195/job/25745528664?pr=25965) keeping this from getting merged?

That was a temporary issue, not related with your changes

@frittentheke
Copy link
Contributor Author

Thanks @fmulero ! May I kindly ask you for an answer on how and when the container image will be updated?
See #22992 (comment)

I suppose #22992 is actually also resolved by this very pull request, provided the required libraries for MySQL + PostgreSQL have also been added.

@frittentheke
Copy link
Contributor Author

@juan131 I read your post #22720 (comment) about added Python modules.

May I kindly ask you to also add the modules to support different database engines?
See comment #22992 (comment)

@javsalgar did mention this request was forwarded to your container team already, but that was back in march: #22992 (comment)

@juan131
Copy link
Contributor

juan131 commented Jun 28, 2024

Hi @frittentheke

As I mentioned on this comment, we want the image to include only the most important pip modules exclusively. This is sth very important to keep the container small and reduce the attack surface from a security point of view.

We add the module for PostgreSQL given it's the default database we use in the Helm chart, the same way we've decided to add the modules for AWS S3 and Google Cloud Storage given they're the most extendedly used cloud solutions for object storage. However, we cannot extend this to every database (or object storage solution) supported by MLFlow given we'll be increasing the number of dependencies significantly. That's why we recommend to extend the Bitnami image adding the required modules for those cases that aren't supported by default.

@frittentheke
Copy link
Contributor Author

As I mentioned on this comment, we want the image to include only the most important pip modules exclusively. This is sth very important to keep the container small and reduce the attack surface from a security point of view.

Sorry @juan131, I totally missed that you indeed already responded.

We add the module for PostgreSQL given it's the default database we use in the Helm chart, the same way we've decided to add the modules for AWS S3 and Google Cloud Storage given they're the most extendedly used cloud solutions for object storage. However, we cannot extend this to every database (or object storage solution) supported by MLFlow given we'll be increasing the number of dependencies significantly. That's why we recommend to extend the Bitnami image adding the required modules for those cases that aren't supported by default.

That's reasonable. I just thought that since the list of database engines / dialects MLflow supports in finite:
mysql, mssql, sqlite, and postgresql (https://mlflow.org/docs/latest/tracking/backend-stores.html#supported-store-types), it might make sense to support them (all) - given that with this PR here it's possible to configure them via the chart.

@Matroxt
Copy link

Matroxt commented Jul 23, 2024

Hi

May I ask why this PR handled the dialect for mlflow.v0.database-auth.uri but not mlflow.v0.database.uri?

Currently, mlflow.v0.database.uri is hardcoded to postgresql which results in the backend-store-uri of mlflow-tracking to always be postgres.

Was it just forgotten? I can open a PR if that,s the case.

@juan131
Copy link
Contributor

juan131 commented Jul 24, 2024

Hi @Matroxt

Please create a PR addressing this particular issue, that'd be highly appreciated. Thanks in advance.

frittentheke added a commit to frittentheke/bitnami-charts that referenced this pull request Jul 24, 2024
…red (bitnami#25965)

This is therefor just a fixup to fix the auth database URI generation.
The previous commit PR bitnami#25965 did not apply the dialect template / config to
the uri of the auth database.

Fixes: bitnami#25965
frittentheke added a commit to frittentheke/bitnami-charts that referenced this pull request Jul 24, 2024
…red (bitnami#25965)

This is a fixup for the auth database URI generation. The previous commit PR bitnami#25965
did not apply the dialect template / config to the uri of the auth database.

Fixes: bitnami#25965
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
frittentheke added a commit to frittentheke/bitnami-charts that referenced this pull request Jul 24, 2024
…red (bitnami#25965)

This is a fixup for the auth database URI generation. The previous commit PR bitnami#25965
did not apply the dialect template / config to the uri of the auth database.

Fixes: bitnami#25965
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
@frittentheke
Copy link
Contributor Author

May I ask why this PR handled the dialect for mlflow.v0.database-auth.uri but not mlflow.v0.database.uri?

Currently, mlflow.v0.database.uri is hardcoded to postgresql which results in the backend-store-uri of mlflow-tracking to always be postgres.

@Matroxt Sorry about that, I quickly pushed a PR #28316 already.

@juan131 if you could kindly review that one?

@Matroxt
Copy link

Matroxt commented Jul 24, 2024

Awesome, thanks for the very quick fix ❤️

datbui pushed a commit to datbui/charts that referenced this pull request Jul 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mlflow solved verify Execute verification workflow for these changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MLFlow allow using other supported database backends as external database
6 participants