Machine learning model template #7

Open
Nolski opened this issue Jul 29, 2019 · 1 comment

Comments

@Nolski
Collaborator

commented Jul 29, 2019

Summary

Teams are hesitant to release their machine learning code as open source because they feel it's not as simple as other application code. It would be nice if we had a succinct template to

Background

This stems from my talks with MacroEyes and Har Zindagi. Both are wary about releasing their machine learning models because the models might be used improperly or misunderstood. I think we could really provide a useful resource that hasn't been used in more than one project yet.

I was reading a white paper a while back which researched this quite a bit. Perhaps we could also adapt the publiccode.yml document created by the public code network?

Outcome

I think a great outcome for this would be a README template, or at the very least a how-to-use document you can embed in the repository or documentation. Something easily accessible, like a Markdown file.

@Nolski Nolski self-assigned this Jul 29, 2019

@Nolski Nolski added this to In progress in LibreCorps Master Tracker Aug 4, 2019

@Nolski

Collaborator Author

commented Aug 5, 2019

Here's an initial pass, pulling from the above paper.

Model Title

Model Details

Basic information about the model.

  • Person or organization developing model
  • Model date
  • Model version
  • Model type
  • Information about training algorithms, parameters, fairness constraints or other applied approaches, and features
  • Paper or other resource for more information
  • Citation details
  • License
  • Where to send questions or comments about the model

Intended Use

Use cases that were envisioned during development.

  • Primary intended uses
  • Primary intended users
  • Out-of-scope use cases

Factors

Factors could include demographic or phenotypic groups, environmental conditions, technical attributes, or others.

  • Relevant factors
  • Evaluation factors

Metrics

Metrics should be chosen to reflect potential real-world impacts of the model.

  • Model performance measures
  • Decision thresholds
  • Variation approaches

Evaluation Data

Details on the dataset(s) used for the quantitative analyses in the card.

  • Datasets
  • Motivation
  • Preprocessing

Training Data

May not be possible to provide in practice. When possible, this section should mirror Evaluation Data. If such detail is not possible, minimal allowable information should be provided here, such as details of the distribution over various factors in the training datasets.

Quantitative Analyses

  • Unitary results
  • Intersectional results

Ethical Considerations

Caveats and Recommendations
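
To make the draft above easier to drop into a repository, here is a minimal sketch of how the section list could be turned into a Markdown README skeleton. The section names and prompts are copied from the draft; everything else (the `MODEL_CARD_SECTIONS` dict, `render_model_card`, and `missing_sections`) is a hypothetical name chosen for illustration, not part of any existing tool.

```python
from typing import Dict, List

# Section names and prompts are taken from the draft template in this issue.
# The dict layout itself is an assumption made for this sketch.
MODEL_CARD_SECTIONS: Dict[str, List[str]] = {
    "Model Details": [
        "Person or organization developing model",
        "Model date",
        "Model version",
        "Model type",
        "Information about training algorithms, parameters, fairness "
        "constraints or other applied approaches, and features",
        "Paper or other resource for more information",
        "Citation details",
        "License",
        "Where to send questions or comments about the model",
    ],
    "Intended Use": [
        "Primary intended uses",
        "Primary intended users",
        "Out-of-scope use cases",
    ],
    "Factors": ["Relevant factors", "Evaluation factors"],
    "Metrics": [
        "Model performance measures",
        "Decision thresholds",
        "Variation approaches",
    ],
    "Evaluation Data": ["Datasets", "Motivation", "Preprocessing"],
    "Training Data": [],
    "Quantitative Analyses": ["Unitary results", "Intersectional results"],
    "Ethical Considerations": [],
    "Caveats and Recommendations": [],
}


def render_model_card(title: str) -> str:
    """Return a Markdown skeleton with one heading per template section."""
    lines = [f"# {title}", ""]
    for section, prompts in MODEL_CARD_SECTIONS.items():
        lines.append(f"## {section}")
        lines.append("")
        for prompt in prompts:
            # Each prompt becomes a bullet ending in a colon, to be filled in.
            lines.append(f"- {prompt}:")
        if prompts:
            lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def missing_sections(readme_text: str) -> List[str]:
    """List template sections that have no matching '## ' heading in the text."""
    present = {
        line[3:].strip()
        for line in readme_text.splitlines()
        if line.startswith("## ")
    }
    return [name for name in MODEL_CARD_SECTIONS if name not in present]
```

Writing the result of `render_model_card("Model Title")` to a `README.md` would give a team a starting file to fill in, and `missing_sections` could serve as a rough check that an existing README still covers every heading. Sections without bullet prompts are rendered as empty headings, since the draft leaves them as free text.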
