🚀 We are going to redesign the trending algorithm #778

Closed
Icemap opened this issue Sep 2, 2022 · 6 comments
Labels
area/trending help wanted Extra attention is needed

Comments

@Icemap
Member

Icemap commented Sep 2, 2022

The developers of OSS Insight are loyal users of GitHub Trending. When we heard that GitHub was deprecating its Trending page, we decided to optimize OSS Insight trending to become a GitHub Trending alternative.

Most of the repos that appear on GitHub Trending deserve the attention, but a few get onto the page by gaming the trending algorithm, so the design of the algorithm matters. We are going to design a new algorithm that finds the most popular repos while preventing projects from reaching the trending page through cheating.

Currently, we can provide GitHub interface interaction metrics like:

  • Star
  • Fork

and code collaboration metrics like:

  • Pull Request
  • Issue Open
  • Issue Close
  • Lines of Code Added
  • Lines of Code Deleted

How should we set the weights of these metrics? If you have any ideas, you are welcome to join the discussion!
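
To make the question concrete, here is a minimal sketch of how per-metric weights could combine into a single score. The weight values and the names `WEIGHTS` and `weighted_score` are placeholders for illustration only; nothing in this thread proposes specific weights.

```python
# Placeholder weights (assumptions, not values from this thread).
WEIGHTS = {
    "star": 1.0,
    "fork": 2.0,
    "pull_request": 3.0,
    "issue_open": 1.0,
    "issue_close": 1.0,
    "additions": 0.001,
    "deletions": 0.001,
}


def weighted_score(counts, weights=WEIGHTS):
    """Return the weighted sum of a repo's metric counts.

    counts maps a metric name to how often it occurred in the period;
    metrics missing from counts are treated as zero.
    """
    return sum(weight * counts.get(metric, 0) for metric, weight in weights.items())
```

Changing a weight changes how much one event of that kind moves a repo up the ranking, which is the tuning question raised above.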

@Icemap Icemap changed the title from "Improve Trending Repos" to "We are going to redesign the trending algorithm" Sep 6, 2022
@Icemap
Member Author

Icemap commented Sep 6, 2022

I have a preliminary idea: build a time-decay ("time sink") algorithm based on the number of Stars and Forks. Set an upper and a lower score limit; the longer ago an operation (a Star or a Fork) happened, the lower its score, down to the lower limit. With this algorithm, add up the scores of all repos over a certain time period and rank them. That gives the trending repos for the period.
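
A minimal sketch of this idea in Python, assuming a linear decay between the two limits. The limits, the 7-day window, and the names `event_score` and `rank_repos` are hypothetical; the thread does not specify the decay shape or any constants.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical parameters (assumptions, not values from this thread).
UPPER_SCORE = 1.0           # score of an event that happened just now
LOWER_SCORE = 0.1           # floor for events at or beyond the window edge
WINDOW = timedelta(days=7)  # period over which trending is computed


def event_score(event_time, now, window=WINDOW,
                upper=UPPER_SCORE, lower=LOWER_SCORE):
    """Score one Star or Fork event: older events score lower, down to `lower`."""
    age = now - event_time
    if age <= timedelta(0):
        return upper
    # Fraction of the window that has elapsed, capped at 1.0.
    fraction = min(age / window, 1.0)
    return upper - (upper - lower) * fraction


def rank_repos(events, now, window=WINDOW):
    """Rank repos by the summed scores of their events inside the window.

    events is an iterable of (repo_name, event_time) pairs.
    """
    totals = {}
    for repo, event_time in events:
        if now - event_time > window:
            continue  # outside the period being ranked
        totals[repo] = totals.get(repo, 0.0) + event_score(event_time, now, window)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Because every event scores at least the lower limit, a repo with many old events can still outrank one with a few recent events; how strongly recency should dominate depends on the two limits.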

@Mini256 Mini256 pinned this issue Sep 8, 2022
@Mini256 Mini256 changed the title from "We are going to redesign the trending algorithm" to "🚀 We are going to redesign the trending algorithm" Sep 8, 2022
@Mini256 Mini256 added the area/trending and help wanted labels Sep 8, 2022
@sykp241095 sykp241095 unpinned this issue Sep 26, 2022
@guoqiangqi

Hi @Icemap, I'm very interested in the OSS Insight trending algorithm you mentioned, the one used by the ossinsight.io website. Could you share some details, such as the formula or the code? I'd really appreciate it!

@Icemap
Member Author

Icemap commented Jan 17, 2023

@guoqiangqi Sure, I'm very glad to help.
We implement it with a single SQL query. That is quite simple in TiDB: because TiDB is an HTAP database, we can run the OLAP workload directly in SQL.
And this is the SQL file. If you have any questions, please feel free to comment here.

@guoqiangqi

@Icemap Got it, thank you so much.

@zpointS

zpointS commented Jan 3, 2024

Hi @Icemap, I'm also interested in the design of the trending algorithm. I found that the SQL file mentioned above has been moved (or deprecated). Is there another way you could share the formula or the code? Again, I'd really appreciate it!

@Icemap
Member Author

Icemap commented Jan 3, 2024

Hi @zpointS. Thanks for your interest. We moved this SQL file to here.

5 participants