Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] Export mlflow:::mlflow_write_model_spec() #10349

Open
2 of 22 tasks
lewinfox opened this issue Nov 10, 2023 · 4 comments · May be fixed by #10507
Open
2 of 22 tasks

[FR] Export mlflow:::mlflow_write_model_spec() #10349

lewinfox opened this issue Nov 10, 2023 · 4 comments · May be fixed by #10507
Labels
area/models MLmodel format, model serialization/deserialization, flavors enhancement New feature or request has-closing-pr This issue has a closing PR language/r R APIs and clients

Comments

@lewinfox
Copy link

Willingness to contribute

Yes. I can contribute this feature independently.

Proposal Summary

I would like to request that mlflow:::mlflow_write_model_spec() is @exported from the package.

Motivation

What is the use case for this feature?

My organisation is using MLFlow as a model registry with R. We have a need to store model flavors that don't fit exactly within the carrier::crate() flavor, so I've written a couple of custom ones. This works really well, but R CMD CHECK is unhappy about my use of the unexported mlflow:::mlflow_write_mdel_spec() function.

Why is this use case valuable to support for MLflow users in general?

I am not aware of a way of building custom MLFlow flavors in R without using this function (short of writing my own function that duplicates the functionality). MLFlow is an awesome product and, once we have these custom flavors implemented, will serve our needs very well.

Why is this use case valuable to support for your project(s) or organization?

We would like to store models in MLFlow that do not conform to the carrier::crate() flavor, which means (as far as I can tell, anyway) we need to implement custom flavor(s).

Why is it currently difficult to achieve this use case?

R CMD CHECK doesn't like the use of unexported functions. This isn't a difficulty per se, but making this change would remove some friction and make the process more developer-friendly.

Details

Here's a flavor implementation for storing arbitrary R objects in MLFlow (the most basic example I could think of). If there's a way of achieving the same result without using mlflow:::mlflow_write_model_spec() I'd love to hear about it.

#' Wrap an R object in preparation for upload to `MLFlow`
#'
#' `rblob` is a generic flavor that can store any R object. Only one object can 
#' be stored - if you want to store multiple objects you can put them in a `list()`.
#'
#' @param object The object to wrap and store in MLFlow.
#'
#' @export
#'
#' @return An `rblob` object ready for use with MLFlow.
rblob <- function(object) {
  class(object) <- c("rblob", class(object))
  object
}

#' @exportS3Method mlflow::mlflow_save_model rblob
mlflow_save_model.rblob <- function(model, path, model_spec = list(), ...) {
  
  if (dir.exists(path)) unlink(path, recursive = TRUE)
  dir.create(path)

  serialized <- serialize(model, NULL)
  saveRDS(serialized, file.path(path, "rblob.Rds"))

  model_spec$flavors <- append(
    model_spec$flavors,
      list(
        rblob = list(
          version = "0.0.1",
          model = "rblob.Rds"
        )
      )
    )

  mlflow:::mlflow_write_model_spec(path, model_spec) # Use of unexported function
  model_spec
}

#' @exportS3Method mlflow::mlflow_load_flavor mlflow_flavor_rblob
mlflow_load_flavor.mlflow_flavor_rblob <- function(flavor, model_path) {
  unserialize(readRDS(file.path(model_path, "rblob.Rds")))
}

What component(s) does this bug affect?

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/gateway: AI Gateway service, Gateway client APIs, third-party Gateway integrations
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

What language(s) does this bug affect?

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations
@lewinfox lewinfox added the enhancement New feature or request label Nov 10, 2023
@github-actions github-actions bot added area/models MLmodel format, model serialization/deserialization, flavors language/r R APIs and clients labels Nov 10, 2023
@BenWilson2
Copy link
Member

Hi @lewinfox there isn't an awful lot of development attention on the R front for MLflow. We would gladly accept contributions that could enhance the experience for advanced R users such as yourself.

Is the intention to create something akin to a custom Rfunc implementation that permits alternative serialization? This really seems like it would be a valuable addition to the core implementation.

Could you draft up a PR with your proposed implementation and we can discuss how to integrate this functionality directly into the R implementation of MLflow?

Copy link

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.

@lewinfox
Copy link
Author

@BenWilson2 absolutely! Would love to. We are winding down for the Xmas break so hopefully I'll have some time in the next couple of weeks.

@lewinfox
Copy link
Author

@BenWilson2 Just getting my dev env set up but I'm failing the must-have-signoff pre-commit hook. What's the correct process to pass this check at this stage of development? Should I just ignore the hooks or should I include "Signed-off-by:" in my messages? Couldn't see a reference to this in CONTRIBUTING.md. Thanks!

@lewinfox lewinfox linked a pull request Nov 26, 2023 that will close this issue
37 tasks
@github-actions github-actions bot added the has-closing-pr This issue has a closing PR label Nov 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/models MLmodel format, model serialization/deserialization, flavors enhancement New feature or request has-closing-pr This issue has a closing PR language/r R APIs and clients
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants