Comparing changes

base repository: SkalskiP/star-track
base: master
head repository: roboflow/star-track
compare: master
Able to merge. These branches can be automatically merged.

Commits on Sep 16, 2024

  1. 88769ff
  2. update Star Tracker workflow

    SkalskiP committed Sep 16, 2024
    138f378
  3. update Star Tracker workflow

    SkalskiP committed Sep 16, 2024
    6f6878d
  4. 084fc77
  5. 5e78192
  6. remove data.csv

    SkalskiP committed Sep 16, 2024
    3a52b09
  7. test new GH action

    SkalskiP committed Sep 16, 2024
    516a042
  8. ci: 👷 args inputs added to action.yml

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    d72f9af
  9. ci(docker): 🐳 dockerfile workdir adjusment

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    c0a7b50
  10. ci: 👷 switch to master repo for star-track action

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    625bdde
  11. fix: 🐞 import fix for app.py

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    902f0a9
  12. fix: 🐞 github_token env added

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    6fbe8a2
  13. fix: 🐞 push access to workflow for csv test

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    9d05471
  14. Update star data

    onuralpszr authored and github-actions[bot] committed Sep 16, 2024
    4f720be
  15. chore: complete gitignore

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    752963e
  16. ci: 👷 small input clean up and dev action added

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    15942c2
  17. ci: 👷 single repo test

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    59fe89e
  18. ci: 👷 dev action clean up

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 16, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    1edfb59

Commits on Sep 17, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 17, 2024
    feba82b
  2. fix: add GH api pagination

    SkalskiP committed Sep 17, 2024
    fad560b
  3. Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    86618e3
  4. chore: remove ls listing

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 17, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    276bc51
  5. docs: 📝 small readme update

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 17, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    a515276

Commits on Sep 18, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 18, 2024
    b0e5292

Commits on Sep 19, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 19, 2024
    e7dd6f9

Commits on Sep 20, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 20, 2024
    f29af83

Commits on Sep 21, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 21, 2024
    2ad7b76

Commits on Sep 22, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 22, 2024
    80c05aa

Commits on Sep 23, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 23, 2024
    710577e

Commits on Sep 24, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 24, 2024
    0ddf591

Commits on Sep 25, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 25, 2024
    9c4ceb0

Commits on Sep 26, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 26, 2024
    769d446
  2. feat: 🚀 add .dockerignore file to exclude unnecessary files from Docker context

    chore: 🔧 update GitHub Actions workflows
      - Add pull_request trigger to dev-star-tracker.yml
      - Update organizations list in star-tracker.yml
    feat: 🚀 add .pre-commit-config.yaml for pre-commit hooks configuration
    chore: 🐍 update Dockerfile to use Python 3.12-slim
    docs: 📚 fix formatting issues in README.md
    fix: 🧹 remove trailing spaces in action.yml
    refactor: ♻️ improve concurrency and add output path configuration in app.py
    feat: 🚀 add output path and filename environment variables in config.py
    refactor: ♻️ improve repository fetching and add request timeout in core.py

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 26, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    9600999
  3. Merge pull request #1 from roboflow/feat/refactors

    refactor: ♻️ improve repository code base and config and more orgs added.
    onuralpszr authored Sep 26, 2024
    Verified: created on GitHub.com and signed with GitHub’s verified signature.
    e14ee1d
  4. Update star data

    onuralpszr authored and github-actions[bot] committed Sep 26, 2024
    d6354ce
  5. ci: 👷 cron ci renamed

    Signed-off-by: Onuralp SEZER <thunderbirdtr@gmail.com>
    onuralpszr committed Sep 26, 2024
    Verified: signed with the committer’s verified signature (onuralpszr, Onuralp SEZER).
    f658a23

Commits on Sep 27, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 27, 2024
    9c0193b

Commits on Sep 28, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 28, 2024
    2c4429a

Commits on Sep 29, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 29, 2024
    01da1e4

Commits on Sep 30, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Sep 30, 2024
    981ddd6

Commits on Oct 1, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 1, 2024
    dea9371

Commits on Oct 2, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 2, 2024
    41748d7

Commits on Oct 3, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 3, 2024
    2706549

Commits on Oct 4, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 4, 2024
    96ae9ed

Commits on Oct 5, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 5, 2024
    1df217b

Commits on Oct 6, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 6, 2024
    d0c293c

Commits on Oct 7, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 7, 2024
    48f517b

Commits on Oct 8, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 8, 2024
    5e46abf

Commits on Oct 9, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 9, 2024
    bf67718

Commits on Oct 10, 2024

  1. Update star data

    SkalskiP authored and github-actions[bot] committed Oct 10, 2024
    b22898f
Showing with 895 additions and 99 deletions.
  1. +41 −0 .dockerignore
  2. +29 −0 .github/workflows/dev-star-tracker.yml
  3. +31 −0 .github/workflows/star-tracker.yml
  4. +164 −1 .gitignore
  5. +33 −0 .pre-commit-config.yaml
  6. +11 −0 Dockerfile
  7. +52 −21 README.md
  8. +15 −0 action.yml
  9. +1 −1 config.json
  10. +177 −2 data/data.csv
  11. +121 −0 pyproject.toml
  12. +1 −1 requirements.txt
  13. +115 −47 startrack/app.py
  14. +7 −1 startrack/config.py
  15. +97 −25 startrack/core.py
41 changes: 41 additions & 0 deletions .dockerignore
@@ -0,0 +1,41 @@
__pycache__/
*.py[cod]
*$py.class

.Python
build/
develop-eggs/
dist/
downloads/
data/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

.python-version

.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/


.mypy_cache/
.dmypy.json
dmypy.json

.pre-commit-config.yaml
.pre-commit-hooks.yaml
29 changes: 29 additions & 0 deletions .github/workflows/dev-star-tracker.yml
@@ -0,0 +1,29 @@
name: 🌟 Star Tracker - Develop

on:
  push:
    branches:
      - master
  pull_request:
    branches:
      - '**'
  workflow_dispatch:

jobs:
  star-tracker-dev:
    runs-on: ubuntu-latest

    steps:
      - name: 🛠️ Checkout Repository
        uses: actions/checkout@v4

      - name: 🌠 Track Stars
        uses: ./
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          organizations: 'roboflow, autodistill, voxel51'

      - name: 📊 Show data.csv
        run: |
          cat data/data.csv
31 changes: 31 additions & 0 deletions .github/workflows/star-tracker.yml
@@ -0,0 +1,31 @@
name: 🌟 Star Tracker - Master/Cron

on:
  schedule:
    - cron: '0 0 * * *' # Runs daily at midnight UTC
  workflow_dispatch:

permissions:
  contents: write # Grants push access


jobs:
  track-stars:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4

      - name: Track Stars
        uses: roboflow/star-track@master
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          organizations: 'roboflow, autodistill, huggingface, voxel51, ultralytics, Lightning-AI'

      - name: Commit Data
        uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: Update star data
          file_pattern: data/data.csv
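
Note on the schedule: GitHub Actions evaluates `schedule` cron expressions in UTC, so `0 0 * * *` fires at 00:00 UTC rather than at local midnight, and scheduled runs may start late when the service is busy. The sketch below is only an illustration of when the next nominal run would fall; the helper name `next_daily_midnight_utc` is hypothetical and not part of this repository.

```python
from datetime import datetime, timedelta, timezone


def next_daily_midnight_utc(now: datetime) -> datetime:
    """Return the next 00:00 UTC strictly after `now`.

    `now` must be timezone-aware; it is converted to UTC first.
    Hypothetical helper for illustration, not part of star-track.
    """
    now_utc = now.astimezone(timezone.utc)
    midnight_today = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    # Midnight of the current UTC day is never after `now`, so add one day.
    return midnight_today + timedelta(days=1)


if __name__ == "__main__":
    # A local time two hours ahead of UTC, late in the evening.
    local_now = datetime(2024, 9, 26, 23, 30, tzinfo=timezone(timedelta(hours=2)))
    print(next_daily_midnight_utc(local_now).isoformat())
```

This also suggests why the daily "Update star data" commits in the list above carry consecutive dates: each nominal run belongs to a new UTC day.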
165 changes: 164 additions & 1 deletion .gitignore
@@ -1,2 +1,165 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
.idea/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/

# OSX
.DS_Store
33 changes: 33 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,33 @@

ci:
  autofix_prs: true
  autoupdate_schedule: weekly
  autofix_commit_msg: "fix(pre_commit): 🎨 auto format pre-commit hooks"
  autoupdate_commit_msg: "chore(pre_commit): ⬆ pre_commit autoupdate"

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
        exclude: test/.*\.py
      - id: check-yaml
        exclude: mkdocs.yml
      - id: check-executables-have-shebangs
      - id: check-toml
      - id: check-case-conflict
      - id: check-added-large-files
      - id: detect-private-key
      - id: pretty-format-json
        exclude: demo.ipynb
        args: ['--autofix', '--no-sort-keys', '--indent=4']
      - id: end-of-file-fixer
      - id: mixed-line-ending

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.7
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
      - id: ruff-format
        types_or: [ python, pyi, jupyter ]
11 changes: 11 additions & 0 deletions Dockerfile
@@ -0,0 +1,11 @@
FROM python:3.12-slim

COPY startrack/ /startrack/
COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt
RUN rm -rf /root/.cache/pip

WORKDIR /
ENV PYTHONPATH="/startrack"
ENTRYPOINT ["python","-m","startrack.app"]
73 changes: 52 additions & 21 deletions README.md
@@ -1,39 +1,70 @@
<h1 align="center">Star-Track </h1>

<p align="center">
</br>
<img width="400" src="https://github.com/SkalskiP/star-track/assets/26109316/b643f69f-5a52-42b6-a54d-b22132beb5ee" alt="star track logo">
</br>
</p>


## 👋 hello

Star-Track is a user-friendly utility for tracking GitHub repository statistics.
Star-Track is a user-friendly utility for tracking GitHub repository statistics.

## 💻 install

- clone repository
- clone repository

```bash
git clone https://github.com/SkalskiP/star-track.git
```

- setup python environment and activate it [optional]
```bash
git clone https://github.com/roboflow/star-track.git
```

```bash
python3 -m venv venv
source venv/bin/activate
```
- setup python environment and activate it \[optional\]

```bash
python3 -m venv venv
source venv/bin/activate
```

- install required dependencies

```bash
pip install -r requirements.txt
```
```bash
pip install -r requirements.txt
```

## ⚙️ execute

```bash
python -m startrack.app
```

## 🐳 Docker

To test the Docker solution locally, follow these steps:

1. **Build the Docker Image**

```bash
docker build -t startrack:latest .
```

2. **Run the Docker Container**

```bash
docker run --rm \
-e GITHUB_TOKEN=your_github_token \
-e INPUT_ORGANIZATIONS=org1,org2 \
-e INPUT_REPOSITORIES=user1/repo1,user2/repo2 \
-v $(pwd)/data:/app/data:z \
startrack:latest
```

### Explanation

- `--rm`: Automatically remove the container when it exits.
- `-e GITHUB_TOKEN=your_github_token`: Set the `GITHUB_TOKEN` environment variable.
- `-e INPUT_ORGANIZATIONS=org1,org2`: Set the `INPUT_ORGANIZATIONS` environment variable.
- `-e INPUT_REPOSITORIES=user1/repo1,user2/repo2`: Set the `INPUT_REPOSITORIES` environment variable.
- `-v $(pwd)/data:/app/data:z`: Mount the `data` directory from your current working directory to the `/app/data` directory in the container. The `:z` suffix applies a shared SELinux label to the mounted content. This allows you to access the output CSV file on your host machine.
- `startrack:latest`: The name of the Docker image to run.

## 📝 Notes

- **Ensure GitHub Token Permissions**: Make sure your GitHub token has the necessary permissions to access the repositories you want to track.
- **Data Directory**: Ensure the `data` directory exists in your current working directory or Docker will create it for you.
- **Environment Variables**: Adjust the environment variables as needed to match your specific use case.

By following these steps, you can test your Docker solution locally and ensure that your GitHub Action will work as expected. If you encounter any issues or need further assistance, feel free to ask!
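
Note on the `docker run` example: the Dockerfile in this comparison sets `WORKDIR /`, and `app.py` defaults the output directory to the relative path `data`. In a plain local `docker run`, the CSV would therefore appear to land in `/data/data.csv`, not under the `/app/data` mount shown above. The sketch below only illustrates that path resolution; `resolve_output_file` is a hypothetical helper, and it assumes the working directory is the Dockerfile's `WORKDIR`.

```python
from pathlib import PurePosixPath


def resolve_output_file(
    workdir: str,
    output_path: str = "data",
    filename: str = "data.csv",
) -> PurePosixPath:
    """Resolve where the CSV ends up inside the container.

    Hypothetical helper: `output_path` and `filename` mirror the defaults
    app.py uses for INPUT_OUTPUT_PATH and INPUT_OUTPUT_FILENAME.
    """
    # Joining with an absolute `output_path` discards `workdir`,
    # which matches how a process resolves absolute paths.
    return PurePosixPath(workdir) / output_path / filename


if __name__ == "__main__":
    # Default: relative "data" directory under WORKDIR "/".
    print(resolve_output_file("/"))
    # Pointing the output path at the mounted directory instead.
    print(resolve_output_file("/", output_path="/app/data"))
```

Under these assumptions, either mounting to `/data` or passing `-e INPUT_OUTPUT_PATH=/app/data` would line the mount up with the output location.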
15 changes: 15 additions & 0 deletions action.yml
@@ -0,0 +1,15 @@
name: 'Star Tracker'
description: 'Track star counts for specified GitHub repositories and organizations.'
inputs:
  organizations:
    description: 'Comma-separated list of organization names'
    required: false
  repositories:
    description: 'Comma-separated list of repository full names (owner/repo)'
    required: false
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{ inputs.organizations }}
    - ${{ inputs.repositories }}
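
Note on how these inputs reach the code: besides the positional `args`, GitHub exposes each declared input to a Docker container action as an environment variable named `INPUT_<NAME>`, formed by upper-casing the input name and replacing spaces with underscores. That matches the `INPUT_ORGANIZATIONS` and `INPUT_REPOSITORIES` names in `startrack/config.py`. A minimal sketch of the naming rule follows; the function `input_env_name` is hypothetical and only illustrates the convention.

```python
def input_env_name(input_name: str) -> str:
    """Map an action input name to its INPUT_* environment variable name.

    Hypothetical helper illustrating GitHub's documented convention:
    upper-case the name and replace spaces with underscores.
    """
    return "INPUT_" + input_name.replace(" ", "_").upper()


if __name__ == "__main__":
    for name in ("organizations", "repositories"):
        print(name, "->", input_env_name(name))
```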
2 changes: 1 addition & 1 deletion config.json
@@ -2,4 +2,4 @@
"organizations": [
"roboflow"
]
}
}
179 changes: 177 additions & 2 deletions data/data.csv

Large diffs are not rendered by default.

121 changes: 121 additions & 0 deletions pyproject.toml
@@ -0,0 +1,121 @@
[tool.ruff]
target-version = "py312"

exclude = [
".bzr",
".direnv",
".eggs",
".git",
".git-rewrite",
".hg",
".mypy_cache",
".nox",
".pants.d",
".pytype",
".ruff_cache",
".svn",
".tox",
".venv",
"__pypackages__",
"_build",
"buck-out",
"build",
"dist",
"node_modules",
"venv",
"yarn-error.log",
"yarn.lock",
"docs",
]

line-length = 89
indent-width = 4

[tool.ruff.lint]
# Enable pycodestyle (`E`, `W`), Pyflakes (`F`), isort (`I`), and the other rule sets listed below.
select = ["E", "F", "I", "A", "Q", "W", "RUF", "UP", "YTT", "NPY", "ANN", "T", "S", "N", "G", "C", "B"]
ignore = []
# Allow autofix for all enabled rules (when `--fix` is provided).
fixable = [
"A",
"B",
"C",
"D",
"E",
"F",
"G",
"I",
"N",
"Q",
"S",
"T",
"W",
"ANN",
"ARG",
"BLE",
"COM",
"DJ",
"DTZ",
"EM",
"ERA",
"EXE",
"FBT",
"ICN",
"INP",
"ISC",
"NPY",
"PD",
"PGH",
"PIE",
"PL",
"PT",
"PTH",
"PYI",
"RET",
"RSE",
"RUF",
"SIM",
"SLF",
"TCH",
"TID",
"TRY",
"UP",
"YTT",
]
unfixable = []
# Allow unused variables when underscore-prefixed.
dummy-variable-rgx = "^(_+|(_+[a-zA-Z0-9_]*[a-zA-Z0-9]+?))$"
pylint.max-args = 20

[tool.ruff.lint.flake8-quotes]
inline-quotes = "double"
multiline-quotes = "double"
docstring-quotes = "double"

[tool.ruff.lint.pydocstyle]
convention = "google"

[tool.ruff.lint.per-file-ignores]
"__init__.py" = ["E402", "F401"]

[tool.ruff.lint.mccabe]
# Flag errors (`C901`) whenever the complexity level exceeds 20.
max-complexity = 20

[tool.ruff.lint.isort]
order-by-type = true
no-sections = false

[tool.ruff.format]
# Like Black, use double quotes for strings.
quote-style = "double"
docstring-code-format = true

# Like Black, indent with spaces, rather than tabs.
indent-style = "space"

# Like Black, respect magic trailing commas.
skip-magic-trailing-comma = false

# Like Black, automatically detect the appropriate line ending.
line-ending = "auto"
2 changes: 1 addition & 1 deletion requirements.txt
@@ -1,2 +1,2 @@
pandas
requests
requests
162 changes: 115 additions & 47 deletions startrack/app.py
@@ -1,23 +1,37 @@
import concurrent.futures
import os
from datetime import datetime
from typing import List
from pathlib import Path

import pandas as pd

from startrack.config import GITHUB_TOKEN_ENV
from startrack.config import (
GITHUB_TOKEN_ENV,
INPUT_ORGANIZATIONS_ENV,
INPUT_OUTPUT_FILENAME_ENV,
INPUT_OUTPUT_PATH_ENV,
INPUT_REPOSITORIES_ENV,
)
from startrack.core import (
list_organization_repositories,
RepositoryType,
RepositoryData,
to_dataframe
RepositoryType,
convert_repositories_to_dataframe,
fetch_all_organization_repositories,
fetch_repository_data_by_full_name,
)

GITHUB_TOKEN = os.environ.get(GITHUB_TOKEN_ENV, None)
ORGANIZATION_NAMES = ["roboflow", "autodistill"]
GITHUB_TOKEN = os.environ.get(GITHUB_TOKEN_ENV)
ORGANIZATIONS = os.environ.get(INPUT_ORGANIZATIONS_ENV, "")
REPOSITORIES = os.environ.get(INPUT_REPOSITORIES_ENV, "")
OUTPUT_PATH = os.environ.get(INPUT_OUTPUT_PATH_ENV, "data")
OUTPUT_FILENAME = os.environ.get(INPUT_OUTPUT_FILENAME_ENV, "data.csv")

ORGANIZATION_NAMES = [org.strip() for org in ORGANIZATIONS.split(",") if org.strip()]
REPOSITORY_NAMES = [repo.strip() for repo in REPOSITORIES.split(",") if repo.strip()]


def save_to_csv(df: pd.DataFrame, directory: str, filename: str) -> None:
"""
Save a DataFrame to a CSV file in the specified directory.
"""Save a DataFrame to a CSV file in the specified directory.
Args:
df (pd.DataFrame): The DataFrame to save.
@@ -31,46 +45,100 @@ def save_to_csv(df: pd.DataFrame, directory: str, filename: str) -> None:
df.to_csv(file_path)


def get_all_organization_repositories(
github_token: str,
organization_name: str,
repository_type: RepositoryType
) -> List:
def fetch_organization_repositories(organization_name: str) -> list[RepositoryData]:
repos = fetch_all_organization_repositories(
github_token=GITHUB_TOKEN,
organization_name=organization_name,
repository_type=RepositoryType.PUBLIC,
)
return [RepositoryData.from_json(repo) for repo in repos]


def fetch_individual_repository(repo_full_name: str) -> RepositoryData | None:
repo_data = fetch_repository_data_by_full_name(
github_token=GITHUB_TOKEN,
repository_full_name=repo_full_name,
)
if repo_data:
return RepositoryData.from_json(repo_data)
return None


def get_all_repositories() -> list[RepositoryData]:
"""Fetch all repositories from specified organizations and individual repositories.
Returns:
List[RepositoryData]: A list of repository data objects.
"""
all_repositories = []
page = 1
while True:
repos = list_organization_repositories(
github_token=github_token,
organization_name=organization_name,
repository_type=repository_type,
page=page)
if not repos:
break
all_repositories.extend(repos)
page += 1

with concurrent.futures.ThreadPoolExecutor() as executor:
# Fetch repositories from specified organizations in parallel
organization_futures = [
executor.submit(fetch_organization_repositories, org_name)
for org_name in ORGANIZATION_NAMES
]

# Fetch specified repositories in parallel
repository_futures = [
executor.submit(fetch_individual_repository, repo_name)
for repo_name in REPOSITORY_NAMES
]

# Collect results from organization futures
for future in concurrent.futures.as_completed(organization_futures):
all_repositories.extend(future.result())

# Collect results from repository futures
for future in concurrent.futures.as_completed(repository_futures):
repo_data = future.result()
if repo_data:
all_repositories.append(repo_data)

return all_repositories


all_repositories_json = []
for organization_name in ORGANIZATION_NAMES:
repositories_json = get_all_organization_repositories(
github_token=GITHUB_TOKEN,
organization_name=organization_name,
repository_type=RepositoryType.PUBLIC)
all_repositories_json.extend(repositories_json)

repositories = [
RepositoryData.from_json(repository_json)
for repository_json
in all_repositories_json]
df = to_dataframe(repositories)
df = df.set_index('full_name').T

current_date = datetime.now().strftime("%Y-%m-%d")
df.index = [current_date]

save_to_csv(
df=df,
directory='data',
filename='data.csv')
def main() -> None:
"""
"Main function to fetch repository data, update the DataFrame, and save it to a CSV file.
""" # noqa: E501

if not GITHUB_TOKEN:
msg = (
"`GITHUB_TOKEN` is not set. Please set the `GITHUB_TOKEN` environment "
"variable."
)
raise ValueError(
msg,
)
if not ORGANIZATION_NAMES and not REPOSITORY_NAMES:
msg = (
"Either `ORGANIZATION_NAMES` or `REPOSITORY_NAMES` must be set. Please "
"provide at least one organization name or repository name."
)
raise ValueError(
msg,
)

repositories = get_all_repositories()
df = convert_repositories_to_dataframe(repositories)
df = df.set_index("full_name").T

current_date = datetime.now().strftime("%Y-%m-%d")
df.index = [current_date]

# Load existing data if the file exists
file_path = Path(OUTPUT_PATH) / OUTPUT_FILENAME
if Path.exists(file_path, follow_symlinks=False):
existing_df = pd.read_csv(file_path, index_col=0)
df = pd.concat([existing_df, df])

save_to_csv(
df=df,
directory=OUTPUT_PATH,
filename=OUTPUT_FILENAME,
)


if __name__ == "__main__":
main()
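
Note on input parsing: `app.py` splits the comma-separated `INPUT_ORGANIZATIONS` and `INPUT_REPOSITORIES` values at import time and drops empty entries. One way to keep that logic in a single place, and to tolerate a variable that is not set at all, is a small helper such as the sketch below. The name `parse_csv_env` and the optional `environ` parameter are assumptions made for illustration; they do not exist in the repository.

```python
import os
from collections.abc import Mapping
from typing import Optional


def parse_csv_env(name: str, environ: Optional[Mapping[str, str]] = None) -> list[str]:
    """Read a comma-separated environment variable into a clean list.

    Hypothetical helper: an unset or empty variable yields an empty list,
    and surrounding whitespace and empty items are dropped.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name) or ""
    return [item.strip() for item in raw.split(",") if item.strip()]


if __name__ == "__main__":
    fake_env = {"INPUT_ORGANIZATIONS": "roboflow, autodistill, "}
    print(parse_csv_env("INPUT_ORGANIZATIONS", fake_env))
    print(parse_csv_env("INPUT_REPOSITORIES", fake_env))
```

Passing the environment as a parameter keeps the helper easy to exercise without mutating `os.environ`.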
8 changes: 7 additions & 1 deletion startrack/config.py
@@ -1 +1,7 @@
GITHUB_TOKEN_ENV = "GITHUB_TOKEN"
GITHUB_TOKEN_ENV = "GITHUB_TOKEN" # noqa: S105
INPUT_ORGANIZATIONS_ENV = "INPUT_ORGANIZATIONS"
INPUT_REPOSITORIES_ENV = "INPUT_REPOSITORIES"

INPUT_OUTPUT_PATH_ENV = "INPUT_OUTPUT_PATH"
INPUT_OUTPUT_FILENAME_ENV = "INPUT_OUTPUT_FILENAME"
HTTP_REQUEST_TIMEOUT = 30
122 changes: 97 additions & 25 deletions startrack/core.py
@@ -1,36 +1,45 @@
from dataclasses import dataclass
from enum import Enum
from typing import List, Dict, Any
from typing import Any

import pandas as pd
import requests

from startrack.config import HTTP_REQUEST_TIMEOUT


@dataclass
class RepositoryData:
"""
Data class for storing repository information.
"""Data class for storing repository information.
Attributes:
full_name (str): The name of the repository.
star_count (int): The number of stars the repository has.
fork_count (int): The number of forks the repository has.
"""

full_name: str
star_count: int
fork_count: int

@classmethod
def from_json(cls, data: Dict[str, Any]) -> 'RepositoryData':
full_name = data['full_name']
star_count = data['stargazers_count']
fork_count = data['forks_count']
def from_json(cls: type["RepositoryData"], data: dict[str, Any]) -> "RepositoryData":
"""Create a RepositoryData instance from a JSON dictionary.
Args:
data (Dict[str, Any]): A dictionary containing repository data.
Returns:
RepositoryData: An instance of RepositoryData.
"""
full_name = data["full_name"]
star_count = data["stargazers_count"]
fork_count = data["forks_count"]
return cls(full_name=full_name, star_count=star_count, fork_count=fork_count)


class RepositoryType(Enum):
"""
Enum for specifying types of repositories.
"""Enum for specifying types of repositories.
Attributes:
ALL: Represents all types of repositories.
@@ -49,14 +58,49 @@ class RepositoryType(Enum):
MEMBER = "member"


def list_organization_repositories(
def fetch_all_organization_repositories(
github_token: str,
organization_name: str,
repository_type: RepositoryType = RepositoryType.ALL,
page: int = 1
) -> List:
repository_type: RepositoryType,
) -> list:
"""Fetch all repositories of a specified organization.
Args:
github_token (str): The GitHub personal access token for authentication.
organization_name (str): The name of the GitHub organization whose repositories
are to be fetched.
repository_type (RepositoryType): The type of repositories to fetch.
Returns:
List: A list of repositories.
"""
Lists the repositories of a specified GitHub organization based on the repository
all_repositories = []
page = 1
with requests.Session() as session:
while True:
repos = fetch_organization_repositories_by_page(
session=session,
github_token=github_token,
organization_name=organization_name,
repository_type=repository_type,
page=page,
)
if not repos:
break
all_repositories.extend(repos)
page += 1

return all_repositories


def fetch_organization_repositories_by_page(
session: requests.Session,
github_token: str,
organization_name: str,
repository_type: RepositoryType = RepositoryType.ALL,
page: int = 1,
) -> list:
"""Lists the repositories of a specified GitHub organization based on the repository
type and page number.
Args:
@@ -70,38 +114,66 @@ def list_organization_repositories(
Returns:
List: A list containing details of the organization's repositories.
"""

headers = {
"Accept-Encoding": "gzip",
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {github_token}",
"X-GitHub-Api-Version": "2022-11-28"
"X-GitHub-Api-Version": "2022-11-28",
}

params = {
"type": repository_type.value,
"page": page
"page": page,
}

url = f"https://api.github.com/orgs/{organization_name}/repos"

response = requests.get(url, headers=headers, params=params)
response = session.get(
url, headers=headers, params=params, timeout=HTTP_REQUEST_TIMEOUT
)
return response.json()


def to_dataframe(repositories: List[RepositoryData]) -> pd.DataFrame:
"""
Convert a list of RepositoryData objects into a pandas DataFrame.
def convert_repositories_to_dataframe(
repositories: list[RepositoryData],
) -> pd.DataFrame:
"""Convert a list of RepositoryData objects into a pandas DataFrame.
Args:
Args:
repositories (List[RepositoryData]): A list of RepositoryData objects.
Returns:
pd.DataFrame: A DataFrame where each row represents a repository, with columns
for the repository's name and star count.
"""
data = [
{'full_name': repository.full_name, 'star_count': repository.star_count}
for repository
in repositories
{"full_name": repository.full_name, "star_count": repository.star_count}
for repository in repositories
]
return pd.DataFrame(data)


def fetch_repository_data_by_full_name(
github_token: str,
repository_full_name: str,
) -> dict[str, Any] | None:
"""Fetch data for a specific repository by its full name.
Args:
github_token (str): The GitHub personal access token for authentication.
repository_full_name (str): The full name of the repository.
Returns:
Dict[str, Any]: A dictionary containing repository data if the request is
successful, otherwise None.
"""
headers = {
"Accept-Encoding": "gzip",
"Accept": "application/vnd.github+json",
"Authorization": f"Bearer {github_token}",
}
url = f"https://api.github.com/repos/{repository_full_name}"
response = requests.get(url, headers=headers, timeout=HTTP_REQUEST_TIMEOUT)
if response.status_code == requests.codes.OK:
return response.json()
return None
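
Note on pagination: `fetch_all_organization_repositories` keeps requesting pages until one comes back empty, and the per-page function returns `response.json()` without checking the status. As far as this diff shows, an error response (a JSON object such as a "Not Found" message rather than a list) would be truthy, so the loop would extend the result with the object's keys and keep going. The sketch below shows one possible guard. It is not the repository's code: `collect_pages`, `fetch_page`, and `max_pages` are hypothetical names, and the page fetcher is injected so the loop can be exercised without network access.

```python
from typing import Any, Callable


def collect_pages(
    fetch_page: Callable[[int], Any],
    max_pages: int = 1000,
) -> list[dict[str, Any]]:
    """Collect items from a 1-based paginated source until an empty page.

    Hypothetical helper: `fetch_page(page)` is expected to return a list
    of items; anything else is treated as an error payload.
    """
    items: list[dict[str, Any]] = []
    for page in range(1, max_pages + 1):
        payload = fetch_page(page)
        if not isinstance(payload, list):
            # Error payloads from a JSON API are typically objects, not lists.
            raise RuntimeError(f"Unexpected response on page {page}: {payload!r}")
        if not payload:
            break
        items.extend(payload)
    return items


if __name__ == "__main__":
    fake_pages = {1: [{"full_name": "org/a"}], 2: [{"full_name": "org/b"}]}
    print(collect_pages(lambda page: fake_pages.get(page, [])))
```

The `max_pages` cap is a defensive assumption so the loop always terminates, even if a source never returns an empty page.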