diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 97198fd1c8..64d2da8f5b 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -24,20 +24,16 @@ $ git remote add upstream https://github.com/chaoss/augur.git ```bash $ git checkout -b my-new-branch ``` -4. Switch between branches -```bash -$ git checkout branch-name -``` -5. Make your change(s). +4. Make your change(s). -6. Commit the change(s) and push to your fork +5. Commit the change(s) and push to your fork ```bash $ git add . $ git commit -s -m "This is my first commit" $ git push -u origin my-new-branch ``` -7. Then, [submit a pull request](https://github.com/chaoss/augur/compare). +6. Then, [submit a pull request](https://github.com/chaoss/augur/compare). At this point, you're waiting on us. We like to at least comment on pull requests within three business days (and, typically, one business day). Once one of our maintainers has had a chance to review your PR, we will either mark it as "needs review" and provide specific feedback on your changes, or we will go ahead and complete the pull request. diff --git a/README.md b/README.md index 741f22f49a..ca9e57bc18 100644 --- a/README.md +++ b/README.md @@ -9,6 +9,17 @@ [![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/2788/badge)](https://bestpractices.coreinfrastructure.org/projects/2788) +## NEW BETA RELEASE ALERT! +Augur released a beta of its new version, which is built from the augur-new branch, here: https://github.com/chaoss/augur/releases/tag/v0.42.0 +- The augur-new branch is a stable version of our new architecture, which features: + - Dramatic improvement in the speed of large scale (10,000+ repos). 
All data is obtained for 10k+ repos within a week + - A new job management architecture that uses Celery and Redis to manage queues, and enables users to run a Flower job monitoring dashboard + - Materialized views to increase the snappiness of APIs and frontends on large-scale data + - Changes to primary keys, which now employ a UUID strategy that ensures unique keys across all Augur instances + - Support for https://github.com/chaoss/sandiego-rh dashboards (view a sample here: https://eightknot.osci.io/). (beautification coming soon!) + - Data collection completeness assurance enabled by a structured, relational data set that is easily compared with platform API Endpoints +- The next release of the new version will include a hosted version of Augur where anyone can create an account and add repos “they care about”. If the hosted instance already has a requested organization or repository, it will be added to a user’s view. If it is a new repository or organization, the user will be notified that collection will take (time required for the scale of repositories added). + ## What is Augur? 
Augur is a software suite for collecting and measuring structured data diff --git a/docs/source/development-guide/images/Aa.jpeg b/docs/source/development-guide/images/Aa.jpeg deleted file mode 100644 index a4bed75dca..0000000000 Binary files a/docs/source/development-guide/images/Aa.jpeg and /dev/null differ diff --git a/docs/source/development-guide/images/Aa.png b/docs/source/development-guide/images/Aa.png index 6351164a6d..0a53a71284 100644 Binary files a/docs/source/development-guide/images/Aa.png and b/docs/source/development-guide/images/Aa.png differ diff --git a/docs/source/development-guide/images/Ab.jpeg b/docs/source/development-guide/images/Ab.jpeg deleted file mode 100644 index ea0a75a6ed..0000000000 Binary files a/docs/source/development-guide/images/Ab.jpeg and /dev/null differ diff --git a/docs/source/development-guide/images/Ab.png b/docs/source/development-guide/images/Ab.png index cb863e29d0..90d0407f79 100644 Binary files a/docs/source/development-guide/images/Ab.png and b/docs/source/development-guide/images/Ab.png differ diff --git a/docs/source/development-guide/images/Ac.jpeg b/docs/source/development-guide/images/Ac.jpeg deleted file mode 100644 index fb2a95cf4f..0000000000 Binary files a/docs/source/development-guide/images/Ac.jpeg and /dev/null differ diff --git a/docs/source/development-guide/images/Ac.png b/docs/source/development-guide/images/Ac.png index 24fbf98075..03e1559ab6 100644 Binary files a/docs/source/development-guide/images/Ac.png and b/docs/source/development-guide/images/Ac.png differ diff --git a/docs/source/development-guide/images/Ad.jpeg b/docs/source/development-guide/images/Ad.jpeg deleted file mode 100644 index 31a785162b..0000000000 Binary files a/docs/source/development-guide/images/Ad.jpeg and /dev/null differ diff --git a/docs/source/development-guide/images/Ad.png b/docs/source/development-guide/images/Ad.png index 69d6493e82..5d39c9de0b 100644 Binary files a/docs/source/development-guide/images/Ad.png and 
b/docs/source/development-guide/images/Ad.png differ diff --git a/docs/source/development-guide/images/Ae .png b/docs/source/development-guide/images/Ae .png index 44e69c0916..606bf5c87e 100644 Binary files a/docs/source/development-guide/images/Ae .png and b/docs/source/development-guide/images/Ae .png differ diff --git a/docs/source/development-guide/images/Ae.jpeg b/docs/source/development-guide/images/Ae.jpeg deleted file mode 100644 index 4b8acbbad5..0000000000 Binary files a/docs/source/development-guide/images/Ae.jpeg and /dev/null differ diff --git a/docs/source/development-guide/images/Af.jpeg b/docs/source/development-guide/images/Af.jpeg deleted file mode 100644 index 34229c4667..0000000000 Binary files a/docs/source/development-guide/images/Af.jpeg and /dev/null differ diff --git a/docs/source/development-guide/images/Af.png b/docs/source/development-guide/images/Af.png index 5fa8ccdfd9..c96d39d5ff 100644 Binary files a/docs/source/development-guide/images/Af.png and b/docs/source/development-guide/images/Af.png differ diff --git a/docs/source/getting-started/Welcome.rst b/docs/source/getting-started/Welcome.rst index 9eb84694c3..c26b871ca0 100644 --- a/docs/source/getting-started/Welcome.rst +++ b/docs/source/getting-started/Welcome.rst @@ -27,7 +27,7 @@ Example: Hi, my name is Precious Onyewuchi. I’m an outreachy intern contributi This is a picture of my own introduction. Some parts of the image were blurred for privacy reasons 3) As a follow-up for number two (2), if you’re contributing from a program (e.g Outreachy, or GSOC), find the respective channel for your program and join it. It’d be easier to get questions regarding that program answered and you can see other people like you in the same program to interact with and get help from. The channel for GSOC is #gsoc, and the channel for Outreachy is #outreachy, #general is for general questions and #augur is for stuff designated to augur, (e.g Installation (i figured that last part a little later. 
So, you’re welcome!) - .. image:: images/channels.png + .. image:: images/channels.PNG :width: 400 :alt: "Slack image" diff --git a/docs/source/getting-started/images/chaoss_slack_workspace.png b/docs/source/getting-started/images/chaoss_slack_workspace.png new file mode 100644 index 0000000000..79c1d72b84 Binary files /dev/null and b/docs/source/getting-started/images/chaoss_slack_workspace.png differ diff --git a/docs/source/schema/regularly_used_data.rst b/docs/source/schema/regularly_used_data.rst index 1b21835b85..5eb9208e2d 100644 --- a/docs/source/schema/regularly_used_data.rst +++ b/docs/source/schema/regularly_used_data.rst @@ -1,16 +1,16 @@ List of Regularly Used Data Tables In Augur =========================================== -** This is a list of data tables in augur that are regularly used and the various workers attached to them. ** +**This is a list of data tables in augur that are regularly used and the various workers attached to them.** - * Commits - This is where a record for every file in every commit in every repository in an Augur instance is kept. + **Commits** - This is where a record for every file in every commit in every repository in an Augur instance is kept. * Worker: Facade worker collects, and also stores platform user information in the commits table. .. image:: images/commits.png :width: 200 - * Contributor_affiliations: A list of emails and domains, with start and end dates for individuals to have an organizational affiliation. + **Contributor_affiliations** : A list of emails and domains, with start and end dates for individuals to have an organizational affiliation. * Populated by default when augur is installed * Can be edited so that an Augur instance can resolve a larger list of affiliations. @@ -19,15 +19,15 @@ List of Regularly Used Data Tables In Augur .. 
image:: images/contributor_affiliations.png :width: 200 - * Contributor_repo - Storage of a snowball sample of all the repositories anyone in your schema has accessed on GitHub. So, for example, if you wanted to know all the repositories that people on your project contributed to, this would be the table. + **Contributor_repo** - Storage of a snowball sample of all the repositories anyone in your schema has accessed on GitHub. So, for example, if you wanted to know all the repositories that people on your project contributed to, this would be the table. - * Contributor_breadth_worker populates this table + * *Contributor_breadth_worker* populates this table * Population of this table happens last, and can take a long time. .. image:: images/contributor_repo.png :width: 200 - * Contributors - These are all the contributors to a project/repo. In Augur, all types of contributions create a contributor record. This includes issue comments, pull request comments, label addition, etc. This is different than how GitHub counts contributors; they only include committers. + **Contributors** - These are all the contributors to a project/repo. In Augur, all types of contributions create a contributor record. This includes issue comments, pull request comments, label addition, etc. This is different than how GitHub counts contributors; they only include committers. * Workers Adding Contributors: @@ -40,7 +40,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/contributors.png :width: 200 - * Contributors_aliases - These are all the alternate emails that the same contributor might use. These records arise almost entirely from the commit log. For example, if I have two different emails on two different computers that I use when I make a commit, then an alias is created for whatever the 2nd to nth email Augur runs across. If a user’s email cannot be resolved, it is placed in the unresolved_commit_emails table. Coverage is greater than 98% since Augur 1.2.4. 
+ **Contributors_aliases** - These are all the alternate emails that the same contributor might use. These records arise almost entirely from the commit log. For example, if I have two different emails on two different computers that I use when I make a commit, then an alias is created for whatever the 2nd to nth email Augur runs across. If a user’s email cannot be resolved, it is placed in the unresolved_commit_emails table. Coverage is greater than 98% since Augur 1.2.4. * Worker: @@ -49,7 +49,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/contributors_aliases.png :width: 200 - * Discourse_insights - There are nine specific discourse act types identified by the computational linguistic algorithm that underlies the discourse insights worker. This worker analyzes each comment on each issue or pull request sequentially so that context is applied when determining the discourse act type. These types are: + **Discourse_insights** - There are nine specific discourse act types identified by the computational linguistic algorithm that underlies the discourse insights worker. This worker analyzes each comment on each issue or pull request sequentially so that context is applied when determining the discourse act type. These types are: * negative-reaction * answer @@ -68,7 +68,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/discourse_insights.png :width: 200 - * issue_assignees || issue_events || issue_labels + **issue_assignees || issue_events || issue_labels** * Worker: @@ -77,7 +77,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/issue_assignees.png :width: 200 - * issue_message_ref - A link between the issue and each message stored in the message table. + **issue_message_ref** - A link between the issue and each message stored in the message table. * Worker: @@ -86,7 +86,7 @@ List of Regularly Used Data Tables In Augur .. 
image:: images/issue_message_ref.png :width: 200 - * issues - Is all the data related to a GitHub Issue. + **issues** - All the data related to a GitHub issue. * Worker: @@ -95,12 +95,12 @@ List of Regularly Used Data Tables In Augur .. image:: images/issues.png :width: 200 - * Message - every pull request or issue related message. These are then mapped back to either pull requests, or issues, using the __msg_ref tables + **Message** - Every pull request or issue related message. These are then mapped back to either pull requests or issues using the __msg_ref tables. .. image:: images/message.png :width: 200 - * Message_analysis: Two factors evaluated for every pull request on issues message: What is the sentiment of the message (positive or negative), and what is the novelty of the message in the context of other messages in that repository. + **Message_analysis:** Two factors are evaluated for every pull request or issue message: the sentiment of the message (positive or negative), and the novelty of the message in the context of other messages in that repository. * Worker: @@ -109,7 +109,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/message_analysis.png :width: 200 - * Message_analysis_summary: A summary level representation of the granular data in message_analysis. + **Message_analysis_summary:** A summary-level representation of the granular data in message_analysis. * Worker: @@ -118,7 +118,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/message_analysis_summary.png :width: 200 - * Platform: Reference data with two rows: one for GitHub, one for GitLab. + **Platform:** Reference data with two rows: one for GitHub, one for GitLab. * Worker: @@ -127,7 +127,7 @@ List of Regularly Used Data Tables In Augur .. 
image:: images/platform.png :width: 200 - * Pull_request_analysis: A representation of the probability of a pull request being merged into a repository, based on analysis of the properties of previously merged pull requests in a repository. (Machine learning worker) + **Pull_request_analysis:** A representation of the probability of a pull request being merged into a repository, based on analysis of the properties of previously merged pull requests in a repository. (Machine learning worker) * Worker: @@ -136,7 +136,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/pull_request_analysis.png :width: 200 - * pull_request_assignees || pull_request_commits || pull_request_events || pull_request_files || pull_request_labels || pull_request_message_ref - All the data related to pull requests. Every pull request will be in the pull_requests data. + **pull_request_assignees || pull_request_commits || pull_request_events || pull_request_files || pull_request_labels || pull_request_message_ref** - All the data related to pull requests. Every pull request will be in the pull_requests data. .. image:: images/pull_request_assignees.png :width: 200 @@ -156,7 +156,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/pull_request_ref.png :width: 200 - * pull_request_meta || pull_request_repo || pull_request_review_message_ref || pull_request_reviewers || pull_request_reviews || pull_request_teams || pull_requests - All the data related to pull requests. Every pull request will be in the pull_requests data. + **pull_request_meta || pull_request_repo || pull_request_review_message_ref || pull_request_reviewers || pull_request_reviews || pull_request_teams || pull_requests** - All the data related to pull requests. Every pull request will be in the pull_requests data. .. image:: images/pull_request.png :width: 200 @@ -179,7 +179,7 @@ List of Regularly Used Data Tables In Augur .. 
image:: images/pull_request_teams.png :width: 200 - * Releases: Github declared software releases or release tags. For example: https://github.com/chaoss/augur/releases + **Releases:** GitHub-declared software releases or release tags. For example: https://github.com/chaoss/augur/releases * Worker: @@ -188,12 +188,12 @@ List of Regularly Used Data Tables In Augur .. image:: images/releases.png :width: 200 - * Repo: A list of all the repositories. + **Repo:** A list of all the repositories. .. image:: images/repo.png :width: 200 - * Repo_badging: A list of CNCF badging information for a project. Reads this api endpoint: https://bestpractices.coreinfrastructure.org/projects.json + **Repo_badging:** A list of CNCF badging information for a project. Reads this API endpoint: https://bestpractices.coreinfrastructure.org/projects.json * Worker: @@ -202,7 +202,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_badging.png :width: 200 - * Repo_cluster_messages: Identifying which messages and repositories are clustered together. Identifies project similarity based on communication patterns. + **Repo_cluster_messages:** Identifies which messages and repositories are clustered together, and measures project similarity based on communication patterns. * Worker: @@ -211,7 +211,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_cluster_messages.png :width: 200 - * Repo_dependencies: enumerates every dependency, including dependencies that are not package managed. + **Repo_dependencies:** Enumerates every dependency, including dependencies that are not package managed. * Worker: @@ -220,7 +220,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_dependencies.png :width: 200 - * Repo_deps_libyear: (enumerates every package managed dependency) Looks up the latest release of any library that is imported into a project. 
Then it compares that release date, the release version of the library version in your project (and its release date), and calculates how old your version is, compared to the latest version. The resulting statistic is “libyear”. This worker runs at least once a month, so over time, you will see if your libraries are being kept up to date, or not. + **Repo_deps_libyear:** Enumerates every package-managed dependency. Looks up the latest release of any library that is imported into a project. Then it compares that release date with the release date of the library version used in your project, and calculates how old your version is compared to the latest version. The resulting statistic is “libyear”. This worker runs at least once a month, so over time you will see whether your libraries are being kept up to date. * Scenarios: * If a library is updated, but you didn’t change your version, the libyear statistic gets larger @@ -233,7 +233,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_deps_libyear.png :width: 200 - * Repo_deps_scorecard: Runs the OSSF Scorecard over every repository ( https://github.com/ossf/scorecard ) : There are 16 factors that are explained at that repository location. + **Repo_deps_scorecard:** Runs the OSSF Scorecard (https://github.com/ossf/scorecard) over every repository. There are 16 factors, which are explained at that repository location. * Worker: @@ -242,12 +242,12 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_deps_scorecard.png :width: 200 - * Repo_groups: reference data. The repo groups in an augur instance. + **Repo_groups:** Reference data listing the repo groups in an Augur instance. .. image:: images/repo_groups.png :width: 200 - * Repo_info: this worker gathers metadata from the platform API that includes things like “number of stars”, “number of forks”, etc. AND it also gives us : Number of issues, number of pull requests, etc. .. 
THAT information we use to determine if we have collected all of the PRs and Issues associated with a repository. + **Repo_info:** This worker gathers metadata from the platform API, such as “number of stars” and “number of forks”. It also records the number of issues and pull requests, which we use to determine whether we have collected all of the PRs and issues associated with a repository. * Worker: @@ -256,7 +256,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_info.png :width: 200 - * Repo_insights: + **Repo_insights:** * Worker: @@ -265,7 +265,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_insights.png :width: 200 - * Repo_insights_records: + **Repo_insights_records:** * Worker: @@ -274,7 +274,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_insights_records.png :width: 200 - * Repo_labor + **Repo_labor** * Worker: @@ -283,22 +283,22 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_labor.png :width: 200 - * Repo_meta: Exists to capture repo data that may be useful in the future. Not currently populated. + **Repo_meta:** Exists to capture repo data that may be useful in the future. Not currently populated. .. image:: images/repo_meta.png :width: 200 - * Repo_sbom_scans: This table links the augur_data schema to the augur_spdx schema to keep a list of repositories that need licenses scanned. (These are for file level license declarations, which are common in Linux Foundation projects, but otherwise not in wide use). + **Repo_sbom_scans:** This table links the augur_data schema to the augur_spdx schema to keep a list of repositories that need licenses scanned. (These are for file-level license declarations, which are common in Linux Foundation projects but otherwise not in wide use.) .. image:: images/repo_sbom_scans.png :width: 200 - * Repo_stats: Exists to capture repo data that may be useful in the future. 
Not currently populated. + **Repo_stats:** Exists to capture repo data that may be useful in the future. Not currently populated. .. image:: images/repo_stats.png :width: 200 - * Repo_topic: Identifies probable topics of conversation in discussion threads around issues and pull requests. + **Repo_topic:** Identifies probable topics of conversation in discussion threads around issues and pull requests. * Worker: @@ -307,7 +307,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/repo_topic.png :width: 200 - * Topic_words: Unigrams, bigrams, and trigrams associated with topics in the repo_topic table. + **Topic_words:** Unigrams, bigrams, and trigrams associated with topics in the repo_topic table. * Worker: @@ -316,7 +316,7 @@ List of Regularly Used Data Tables In Augur .. image:: images/topic_words.png :width: 200 - * Unresolved_commit_emails - emails from commits that were not initially able to be resolved using automated mechanisms. + **Unresolved_commit_emails** - Emails from commits that could not initially be resolved using automated mechanisms. * Worker: