
Track code stats #17

Open
thorwhalen opened this issue Nov 16, 2022 · 3 comments

@thorwhalen
Member

The objective is to be able to easily check both current and historical stats (such as code quality and complexity metrics) on the state of our repositories.

Let's look for existing tools for this first. But if we need to hack a minimal one ourselves, here are some ideas:

  • CI updates a hidden "tracks" file (json, yaml, csv, ...) with the new stats on every CI run (maybe only successful ones); see the sketch after this list
  • Alternative: use an API to tell a service to dump the stats in an actual DB. Probably Elastic, since it has Kibana to view graphs etc. It would be easier to do analysis directly if we don't have to fetch files from each of the repos.
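
A minimal sketch of the first idea, assuming a hypothetical hidden `.code_stats.jsonl` tracks file and using radon's raw metrics as the example stat source (any other metric could be plugged in the same way); a CI step would just run this script on every (successful) build:

```python
"""Append the current repo's code stats to a hidden tracks file (JSON lines)."""
import json
import subprocess
import time
from pathlib import Path

from radon.raw import analyze  # radon's raw-metrics API (loc, lloc, sloc, ...)


def repo_stats(root="."):
    """Sum radon's raw metrics over all .py files under `root`."""
    totals = {"loc": 0, "lloc": 0, "sloc": 0, "comments": 0, "blank": 0}
    for path in Path(root).rglob("*.py"):
        raw = analyze(path.read_text(encoding="utf-8"))
        for key in totals:
            totals[key] += getattr(raw, key)
    return totals


def append_record(tracks_file=".code_stats.jsonl"):
    """Append one timestamped record for the current commit."""
    commit = subprocess.run(
        ["git", "rev-parse", "HEAD"], capture_output=True, text=True
    ).stdout.strip()
    record = {"timestamp": time.time(), "commit": commit, **repo_stats()}
    with open(tracks_file, "a") as f:
        f.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    append_record()
```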

I've written a few things in umpyre, so you can accumulate/aggregate further stuff there.
But if possible, let's use others' packages instead of writing our own; still, we can develop our own functional interface to these tools in umpyre.

What to track, and with what?

This should, of course, be parametrizable, because we will no doubt want to add and remove trackers.
Implementation-wise, this points to an architecture where we specify the what in a config file that is kept separate from the code that uses it (see the sketch below).
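
A minimal sketch of that architecture could look like this (the config file name, tracker names, and registry below are all hypothetical, not an existing umpyre interface):

```python
"""Run only the trackers listed in a config file that lives outside the code."""
import json
from pathlib import Path

# Registry of available trackers: each is a zero-argument callable returning a dict of stats.
# The bodies here are placeholders; real ones might wrap radon, wily, coverage, etc.
TRACKERS = {
    "raw": lambda: {},         # e.g. wrap radon's raw metrics
    "complexity": lambda: {},  # e.g. wrap radon cc or wily
    "coverage": lambda: {},    # e.g. parse coverage's report
}


def run_trackers(config_path="trackers.json"):
    """Read the config (a plain JSON list of tracker names) and run each enabled tracker."""
    enabled = json.loads(Path(config_path).read_text())  # e.g. ["raw", "coverage"]
    return {name: TRACKERS[name]() for name in enabled}


# Adding or removing a tracker is then just an edit to trackers.json, not to the code.
```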

Proposals for tools to use are welcome!

Here are a few ideas:

  • radon computes Halstead metrics and a "Maintainability Index" (as well as cyclomatic complexity and raw line counts).
  • coverage
@zfeng10
Contributor

zfeng10 commented Dec 15, 2022

Tools Ziyan has looked into:

  • wily (Recommended)

    • examines the current complexity of the project or of any of the individual Python files
    • can be used in a CI/CD workflow to compare the complexity of the current files against a particular commit
    • can be used with pre-commit hooks
    • visualizes changes in complexity over time with tables and graphs
    • some commands:
      • wily build: iterate through the Git history and analyze the metrics for each file
      • wily report: see the historical trend in metrics for a given file or folder
      • wily graph: graph a set of metrics in an HTML file
      • wily diff: compares the last indexed data with the current working copy of a file
    • will detect and scan all Python code in .ipynb files automatically
    • uses radon to calculate metrics
  • coverage (Recommended, used together with wily)

    • finds the lines of code that are not exercised by tests
    • can easily run on top of pytest
  • radon

    • computes the following (see the usage sketch after this list):
      • cc: compute Cyclomatic Complexity
      • mi: compute Maintainability Index
      • hal: compute Halstead complexity metrics
      • raw: compute raw metrics
        • LOC: the total number of lines of code
        • LLOC: the number of logical lines of code
        • SLOC: the number of source lines of code - not necessarily corresponding to the LLOC [[Wikipedia]](https://radon.readthedocs.io/en/latest/commandline.html#wikipedia)
        • comments: the number of Python comment lines (i.e. only single-line comments #)
        • multi: the number of lines representing multi-line strings
        • blank: the number of blank lines (or whitespace-only ones)
    • works with Jupyter Notebook
  • mccabe - checks McCabe code complexity (the number of independent code paths present; we want a score ≤ 10)

    • 1–10: low risk, simple program;
    • 11–20: moderate risk, more difficult program;
    • 21–50: high risk, very difficult program;
    • > 50: very high risk, untestable program.
    • flake8 wraps around it (mccabe is what powers flake8's --max-complexity check)
  • [complexity](https://github.com/thoughtbot/complexity) - a command-line tool that calculates an approximation of code complexity per file in a language-agnostic way

    • the interpretation of its metric values is not very intuitive
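
For reference, a rough sketch of what computing radon's metrics from Python (rather than via its CLI) might look like; the sample source string is just for illustration:

```python
from radon.complexity import cc_visit        # cyclomatic complexity per function/class
from radon.metrics import h_visit, mi_visit  # Halstead metrics, Maintainability Index
from radon.raw import analyze                # raw metrics (loc, lloc, sloc, comments, ...)

source = '''
def absolute(x):
    if x > 0:
        return x
    return -x
'''

print(analyze(source))                                     # raw line counts
print([(b.name, b.complexity) for b in cc_visit(source)])  # per-block cyclomatic complexity
print(mi_visit(source, multi=True))                        # maintainability index (0-100)
print(h_visit(source))                                     # Halstead report
```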

@thorwhalen
Member Author

Seems like wily + coverage is the way to go. @valentin-feron, what do you think?

@valentin-feron
Member

Yes, I agree. I would even say pylint + wily + coverage (code styling + complexity tracking + test coverage).
