docs: add dataset license files and readme#159

Merged
asteier2026 merged 2 commits into main from asteier2026/docs/licenses on May 15, 2026

@asteier2026
Contributor

Changes include:

  • Added dataset license files for TAB_legal_sample25.csv and NVIDIA_synthetic_biographies.csv, along with a README.

@asteier2026 asteier2026 requested a review from a team as a code owner May 15, 2026 16:04
@greptile-apps
Contributor

greptile-apps Bot commented May 15, 2026

Greptile Summary

This PR adds license files and a README for the two demo datasets (NVIDIA_synthetic_biographies.csv and TAB_legal_sample25.csv), addressing attribution and licensing disclosure gaps that were previously flagged.

  • NVIDIA_synthetic_biographies.LICENSE documents CC BY 4.0 with a proper attribution block (NVIDIA Corporation, 2026) and a link to the canonical license.
  • TAB_legal_sample25.LICENSE documents CC BY-NC 4.0 with attribution to Pilan et al. (2022) and the upstream TAB repository URL.
  • docs/data/README.md introduces a dataset table with license links and adds a prominent blockquote warning that the TAB dataset must not be used commercially.

Confidence Score: 5/5

Documentation-only changes that add license files and a README; no executable code is modified.

All three new files are purely documentation. The previously raised attribution gaps are addressed with attribution blocks in both LICENSE files, and the commercial-use restriction for the TAB dataset is surfaced prominently in the README. No logic, APIs, or data pipelines are touched.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/data/NVIDIA_synthetic_biographies.LICENSE | CC BY 4.0 license file with attribution block (NVIDIA Corporation, 2026) and link to canonical license; addresses the previously raised attribution gap. |
| docs/data/TAB_legal_sample25.LICENSE | CC BY-NC 4.0 license file with attribution block (Pilan et al., 2022, TAB repo URL); addresses the previously raised attribution gap. |
| docs/data/README.md | Adds dataset listing table with license links; includes a prominent blockquote warning users that TAB_legal_sample25.csv must not be used commercially. |
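The file list above suggests a pairing convention: each `<name>.csv` in `docs/data/` sits next to a `<name>.LICENSE`. A minimal sketch of a check for that pairing follows. The helper name `find_unlicensed_datasets` is hypothetical and not part of this PR, and the convention itself is inferred from the two datasets listed here rather than from any documented project rule.

```python
from pathlib import Path


def find_unlicensed_datasets(data_dir):
    """Return names of CSV files in data_dir that lack a sibling .LICENSE file.

    Hypothetical helper: the `<name>.csv` / `<name>.LICENSE` pairing is
    inferred from the files in this PR, not a documented project rule.
    """
    data_dir = Path(data_dir)
    missing = []
    for csv_path in sorted(data_dir.glob("*.csv")):
        # TAB_legal_sample25.csv -> TAB_legal_sample25.LICENSE
        license_path = csv_path.with_suffix(".LICENSE")
        if not license_path.is_file():
            missing.append(csv_path.name)
    return missing
```

Called as `find_unlicensed_datasets("docs/data")`, this would return an empty list when every dataset has its license file, which is the state this PR aims for.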

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[docs/data/] --> B[NVIDIA_synthetic_biographies.csv]
    A --> C[TAB_legal_sample25.csv]
    B --> D[NVIDIA_synthetic_biographies.LICENSE\nCC BY 4.0\nNVIDIA Corporation, 2026]
    C --> E[TAB_legal_sample25.LICENSE\nCC BY-NC 4.0\nPilan et al., 2022]
    D --> F[✅ Commercial use permitted]
    E --> G[⚠️ Non-commercial only\nWarning added to README]
    A --> H[README.md\nDataset table + commercial-use notice]
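The flowchart's dataset-to-license-to-usage mapping can also be written as a small lookup, for example to guard a pipeline that must not touch non-commercial data. This is only an illustrative sketch: the dictionary and function names are hypothetical, and the table covers just the two licenses named in this PR.

```python
# Licenses named in this PR and whether they permit commercial use.
# CC BY 4.0 permits it; CC BY-NC 4.0 (NonCommercial) does not.
COMMERCIAL_USE_ALLOWED = {
    "CC BY 4.0": True,
    "CC BY-NC 4.0": False,
}

# Dataset-to-license mapping as described in the summary above.
DATASET_LICENSES = {
    "NVIDIA_synthetic_biographies.csv": "CC BY 4.0",
    "TAB_legal_sample25.csv": "CC BY-NC 4.0",
}


def allows_commercial_use(dataset_name):
    """Return True if the dataset's license permits commercial use.

    Hypothetical helper; raises KeyError for datasets or licenses
    that are not listed above, so unknown data is never assumed safe.
    """
    license_name = DATASET_LICENSES[dataset_name]
    return COMMERCIAL_USE_ALLOWED[license_name]
```

Raising on unknown names, instead of defaulting to `True`, keeps an unlisted dataset from being treated as commercially usable by accident.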

Reviews (2): Last reviewed commit: "fix: adjust license files"

Comment thread docs/data/NVIDIA_synthetic_biographies.LICENSE
Comment thread docs/data/README.md Outdated
Collaborator

@lipikaramaswamy lipikaramaswamy left a comment

🚢

@lipikaramaswamy
Collaborator

Assuming the guidance is to have .LICENSE files?

@asteier2026
Contributor Author

Additional license info added.

@asteier2026 asteier2026 merged commit 78f550f into main May 15, 2026
11 checks passed
@asteier2026 asteier2026 deleted the asteier2026/docs/licenses branch May 15, 2026 16:19
asteier2026 added a commit that referenced this pull request May 15, 2026
* feature: add dataset license files

* fix: adjust license files