DEV - Create HuggingFace Repo For New Years #2078

@rayneng

Overview

We need a script that, when run in GitHub Actions, creates a new HuggingFace repository for the new year. This will eliminate the dev lead's manual step of creating a 311-data/YYYY repository every year, a process that often results in failing GitHub Actions pipeline builds.

More Info

311 devs are given token access to the 311-data-dev HF account, but not to the 311-data HF account, so they cannot manually create repositories in our prod storage repos. The prod tokens are stored in the 311 Data GitHub repository's secrets and can be used in GitHub Actions (e.g. via a Python script).

Action Items

  • Modify the updateHfDataset.py script to make a new HuggingFace repository for the current year
    • do this programmatically; do not hard-code the year
    • if the repo already exists, the operation should be a no-op
  • Test the GitHub Action locally (see README for instructions)
    • provide screenshot in the comments showing that the new 2026 repository was created in huggingface.co/311-data-dev as a result of running the workflow
  • Create PR and merge code
  • Run workflow manually in the main 311 Data repository
    • provide screenshot in comments showing that the new 2026 repository was created in huggingface.co/311-data as a result of running the workflow
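
The first action item could be sketched roughly as follows. This is a minimal sketch, not the actual contents of updateHfDataset.py: the function names, the `HF_TOKEN` and `HF_ORG` environment variable names, the choice of UTC for "current year", and the `dataset` repo type are all assumptions made for illustration. The only library call used is `huggingface_hub.create_repo`, whose `exist_ok=True` argument makes the call a no-op when the repository already exists.

```python
import os
from datetime import datetime, timezone
from typing import Callable, Optional


def build_repo_id(org: str, year: Optional[int] = None) -> str:
    """Return '<org>/<YYYY>'; the year defaults to the current UTC year."""
    if year is None:
        # Assumption: UTC is acceptable; swap in a local timezone if needed.
        year = datetime.now(timezone.utc).year
    return f"{org}/{year}"


def ensure_year_repo(
    org: str,
    token: Optional[str],
    create_repo: Callable[..., object],
    year: Optional[int] = None,
) -> str:
    """Create the yearly repo if it is missing and return its repo id.

    `create_repo` is injected so the function can be exercised without
    network access; in the workflow it would be huggingface_hub.create_repo.
    """
    repo_id = build_repo_id(org, year)
    # exist_ok=True means an already-existing repo does not raise an error.
    # Assumption: the yearly repos are dataset repos.
    create_repo(repo_id, token=token, repo_type="dataset", exist_ok=True)
    return repo_id


def main() -> None:
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import create_repo

    # Hypothetical variable names; use whatever the workflow actually exports.
    org = os.environ.get("HF_ORG", "311-data-dev")
    token = os.environ.get("HF_TOKEN")
    print(f"Ensured repository {ensure_year_repo(org, token, create_repo)}")
```

In a workflow run, the script would call `main()` with the token supplied from the repository's secrets. Defaulting the hypothetical `HF_ORG` variable to 311-data-dev keeps local test runs away from prod unless the org is set explicitly.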

Resources/Instructions

HuggingFace Links

Screenshots

Screenshot: 311-data-dev with no 2026 repository

Taken on 2026-02-18 @ 10:40 AM PST

  • use this as evidence that the 2026 repo did not exist in the dev HF repo prior to this date and time

Screenshot: 311-data (prod) with no 2026 repository

Taken on 2026-02-18 @ 10:43 AM PST

  • use this as evidence that the 2026 repo did not exist in the prod HF repo prior to this date and time

Metadata

Assignees

Labels

  • Complexity: Medium (require research/investigation before completing; internal team info/input or external team question)
  • Feature: Code Health (make our code more readable, testable, and modular)
  • Role: DevOps (infrastructure, CI or related work)
  • p-feature: Analytics (collection & study of data on how people use the product)
  • size: 1pt (can be done in 6 hours)

Type

No type

Projects

Status

Done

Relationships

None yet

Development

No branches or pull requests