OpenDataology Project Proposal

Name of proposed project:

OpenDataology

Requested project maturity:

Sandbox

Project description:

OpenDataology is an open source dataset license compliance analysis project. It helps users of publicly available datasets, and users who curate a dataset from multiple data sources (particularly for use as part of machine learning models), identify potential license compliance risks. The project consists of three key components.

  • A dataset license compliance analysis workflow that ascertains the final allowed rights and the required obligations associated with using a publicly available dataset, or a dataset curated from multiple data sources, for any purpose.
  • A growing database and a web portal that document the final rights and obligations (after the license compliance analysis is conducted) associated with the datasets and data sources analyzed in our project. The database also documents the metadata collected and used to conduct the compliance workflow.
  • An online license generation toolkit that lets dataset creators generate custom licenses based on the exact rights and obligations they want to allow (instead of having to rely on the limited set of existing dataset-specific licenses).
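As a rough illustration of the first component, the following minimal Python sketch combines the rights and obligations of several data sources. It assumes, purely for illustration, that a curated dataset may only exercise rights granted by every source (set intersection) and must honor the obligations of all sources (set union). The `SourceLicense` type, the `combine` function, and the right and obligation labels are hypothetical and are not taken from the OpenDataology code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceLicense:
    """Hypothetical record of what one data source's license allows and requires."""
    name: str
    rights: frozenset
    obligations: frozenset


def combine(sources):
    """Return (rights, obligations) for a dataset curated from all given sources.

    Assumption: a right survives only if every source grants it, and an
    obligation applies if any source imposes it.
    """
    if not sources:
        raise ValueError("at least one data source is required")
    rights = frozenset.intersection(*(s.rights for s in sources))
    obligations = frozenset.union(*(s.obligations for s in sources))
    return rights, obligations


# Two made-up sources with illustrative labels.
source_a = SourceLicense(
    "source-a",
    frozenset({"access", "modify", "commercial-use"}),
    frozenset({"attribution"}),
)
source_b = SourceLicense(
    "source-b",
    frozenset({"access", "modify"}),
    frozenset({"share-alike"}),
)

rights, obligations = combine([source_a, source_b])
print(sorted(rights))       # ['access', 'modify']
print(sorted(obligations))  # ['attribution', 'share-alike']
```

A real analysis would also need to handle conflicting terms and purpose-specific restrictions, which this sketch deliberately leaves out.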

Statement on alignment with LF AI’s mission:

Publicly available datasets are at the heart of open AI and machine learning software and models. Using publicly available datasets compliantly will be a key part of enabling LF AI's mission to build and support an open artificial intelligence (AI) and data community. This project will always remain open, transparent and accessible to users and contributors alike.

Describe identified possible collaboration opportunities with current LF AI hosted projects:

We have identified collaboration opportunities with the following LF AI projects:

  • OpenBytes
  • OpenLineage
  • OpenDS4All

In addition, we currently collaborate with the LF project SPDX.

License name, version, and URL to the license text:

MIT License, https://opensource.org/licenses/MIT

URL to location of source code (GitHub, etc.):

The source code is hosted on GitHub: https://github.com/OpenDataology/OpenDataology

Does the project sit in its own GH organization?

Yes - https://github.com/OpenDataology

Do you have the GH DCO app active in the repos?

Yes

Issue tracker (GitHub, JIRA, etc) - Please confirm tools in use.

We use the GitHub issue tracker. Our issues can be found at: https://github.com/OpenDataology/OpenDataology/issues

Collaboration tools (mailing lists, wiki, IRC, Slack, Gitter, etc.) - Please confirm tools in use:

External dependencies including licenses (name and version) of those dependencies.

There are no external dependencies in our project.

Initial committers (name, email, organization) and how long have they been working on the project?

Initial contributors (name, email, organization) and how long have they been working on the project?

Has the project defined the roles of contributor, committer, maintainer, etc.? Please document it in MAINTAINERS.md.

Our contributors, committers and maintainers can be found here: https://github.com/OpenDataology/OpenDataology/blob/main/CONTRIBUTORS.md

Total number of contributors to the project including their affiliations at the time of submitting this proposal:

We have 8 contributors to the project. Our contributors come from Huawei Canada, Huawei China, Grandall China, York University (Canada) and the University of Victoria (Canada).

Does the project have a release methodology? Please document it in RELEASES.md.

No

Does the project have a code of conduct? If yes, please share the URL. If not, please create CODE_OF_CONDUCT.md and point to https://lfprojects.org/policies/code-of-conduct/. You can use conduct@lfai.foundation as email for contact on this topic.

Our code of conduct can be found here: https://github.com/OpenDataology/OpenDataology/blob/main/CODE_OF_CONDUCT.md

Do you have any specific infrastructure requests needed as part of the project in the LF AI?

Confluence wiki

Project website - Do you have a website? If no, did you reserve a domain and would you like to have a website created?

No. We have reserved a domain name (OpenDataology.com), but we haven't created a website yet. We will create the website in the near future.
https://github.com/OpenDataology/OpenDataology

Project governance - Do you have a working governance model for the project? Please provide the URL to where it is documented, typically GOVERNANCE.md.

Yes. Our governance model can be found here: https://github.com/OpenDataology/OpenDataology/blob/main/GOVERNANCE.md

Social media accounts - Do you have any Twitter/LinkedIn/Facebook/etc. project accounts? Please provide pointers.

Twitter: @OpenDataology

Existing sponsorship (e.g., whether any organization has provided funding or other support to date, and a description of that support), if any.

The project has not received any external funding.