This repository has been archived by the owner on Mar 8, 2023. It is now read-only.

Recently updated artefacts out of date #9

Closed
marianfoo opened this issue Oct 4, 2021 · 10 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@marianfoo
Contributor

The page 'Recently updated artefacts' shows all artefacts that have been updated recently.
However, since the update process is not triggered often enough, the data is increasingly out of date.
Is it possible to start this process more often, or to move it to a separate repo like dotabap.org and thus keep the 'trends.json' and 'allItems.json' up to date?

@IObert
Contributor

IObert commented Oct 4, 2021

Hi,

That sounds like a fair comment, but I'm not sure what the best way to deal with this situation is. I guess there are multiple options, such as:

  • Running on a weekly basis (instead of once per month). Our team internally decided against this to get a more robust trend. That decision isn't set in stone, but I would like to hear what the community thinks about it.
  • Changing the criteria of "recently updated" to one week prior to the last run. This would at least mean that artifacts in the list cannot be older than 5 weeks.
  • Defining a new GH action that only pulls the last update date of the items listed here and removes old items from the ranking. This particular ranking could then change every day/week while the other rankings only change in the following month. It would also mean that some items might be removed from the ranking over the course of the month without new items being added.

Do any of these ideas cover what you mean with dotabap.org? If not, can you please explain in more detail what you had in mind?
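
The second and third options above could be sketched roughly as follows. This is only an illustration: the item shape (`name`, `updated_at`) and the function names `recently_updated` and `prune_stale` are assumptions and are not taken from the repository.

```python
from datetime import datetime, timedelta

# Hypothetical item shape: {"name": str, "updated_at": datetime}


def recently_updated(items, last_run, window=timedelta(weeks=1)):
    """Option 2: keep items updated within `window` before the last run."""
    cutoff = last_run - window
    return [item for item in items if item["updated_at"] >= cutoff]


def prune_stale(ranking, fresh_dates, cutoff):
    """Option 3: drop ranked items whose refreshed update date is too old.

    `fresh_dates` maps item names to newly pulled update dates. Items
    without a fresh date fall back to their stored date. No new items
    are ever added, so the list can only shrink between full runs.
    """
    return [
        item
        for item in ranking
        if fresh_dates.get(item["name"], item["updated_at"]) >= cutoff
    ]
```

With a monthly run, the one-week window is what bounds the age of a listed artifact to roughly five weeks.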

@IObert IObert self-assigned this Oct 4, 2021
@marianfoo
Contributor Author

marianfoo commented Oct 4, 2021

Hi,

I don't mean the calculation of the trend data or the ranking.
I am actually only interested in the display of the respective information (e.g. recently updated or stars)
and the resulting order in "recently updated".
If this information is only retrieved once a month, the data that is displayed is up to 4 weeks out of date.
As an example: I use dotabap.org to see if there are new repos in the list and which repos have been updated.
Since the list is always sorted by "recently updated", I have this information at a glance.
The list is also updated once a day.

I would then wish for the following:
  • Displayed information (last update, stars, forks, pulls) is retrieved separately and more often (ideally once a day).
  • Continue to calculate trend data as desired (monthly, weekly).
  • More sorting:
      • e.g. for "recently updated", a filter according to "recently updated"
      • for "new", sort by "creation date"
      • possibly also sorting by stars, forks, pulls
I also only recently realised that all three pages (trends, new, updated) are sorted by ranking, which is not very obvious.
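
A minimal sketch of the sorting wish above, assuming each item is a dictionary with `updated_at`, `created_at`, `stars` and `forks` fields. These field names, and the `SORT_KEYS` mapping, are hypothetical; the real 'allItems.json' may be structured differently.

```python
# Hypothetical mapping from a page or sort option to the item field used.
SORT_KEYS = {
    "updated": "updated_at",
    "new": "created_at",
    "stars": "stars",
    "forks": "forks",
}


def sort_items(items, by):
    """Return the items sorted in descending order by the chosen field.

    Dates are assumed to be ISO 8601 strings in one consistent format,
    so plain string comparison orders them chronologically.
    """
    field = SORT_KEYS[by]
    return sorted(items, key=lambda item: item[field], reverse=True)
```

For example, `sort_items(items, "updated")` would put the most recently updated artifact first, independent of its ranking.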

"New Artifacts" is also a bit misleading, I think "Newly added Artifacts" makes it a bit clearer.

@IObert
Contributor

IObert commented Oct 5, 2021

All good points!
I could see a daily job that updates the items. One problem here might be the BigQuery API key, which doesn't allow such frequent access to the table. All other providers should remain within the free quota even when we update the stats daily.

I don't think I'll have time for this in the next few weeks as Devtoberfest and TechEd keep me quite busy. But I'll keep the issue open in case someone wants to tackle this.

@IObert IObert removed their assignment Oct 5, 2021
@IObert IObert added enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed labels Oct 5, 2021
@marianfoo
Contributor Author

Hi @IObert
the new GitHub action seems to have worked.
But unfortunately the rebuild was not triggered.
Should the rebuild be done in the gh-update action or should the new data also be written directly to the docs folder?

@IObert
Contributor

IObert commented Nov 24, 2021

I think it makes sense to write directly into docs at the end of the gh-update action
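
As a rough sketch of that idea, the last step of the update job could serialize the refreshed items straight into the docs folder. The function name `publish` and the default arguments are assumptions; only the `docs` folder and the 'allItems.json' file name come from this thread.

```python
import json
from pathlib import Path


def publish(items, docs_dir="docs", filename="allItems.json"):
    """Write the refreshed items as JSON into the docs folder.

    Creates the folder if it is missing and returns the written path.
    """
    target = Path(docs_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(items, indent=2), encoding="utf-8")
    return target
```

Writing the file in the same job avoids depending on a separate rebuild being triggered afterwards.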

@marianfoo
Contributor Author

The commit today looks good.

> All good points! I could see a daily job that updates the items. One problem here might be the BigQuery API Key that doesn't allow such a frequent access to the table. All other providers should remain within the free quota even when we update the stats daily.

Does it make sense to expand this action to the other providers?
Should PyPI be excluded from this?

@IObert
Contributor

IObert commented Nov 26, 2021

I think it makes sense to extend this for npm. I'd exclude PyPI for now as it uses a paid API key.
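
One way to express this rule in the daily job is a small provider table with a flag for paid access. The `PROVIDERS` table and the `paid_api` flag below are hypothetical and only illustrate the selection logic.

```python
# Hypothetical provider table; `paid_api` marks providers whose stats
# are fetched through a paid API key and should not be refreshed daily.
PROVIDERS = {
    "github": {"paid_api": False},
    "npm": {"paid_api": False},
    "pypi": {"paid_api": True},
}


def daily_providers(providers):
    """Return the names of the providers that are safe to refresh daily."""
    return sorted(name for name, cfg in providers.items() if not cfg["paid_api"])
```

Under these assumptions, `daily_providers(PROVIDERS)` selects GitHub and npm, while PyPI stays on the existing schedule.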

@IObert
Contributor

IObert commented Nov 26, 2021

And yes, it looks great! Many thanks for your great work 🤩!

@marianfoo
Contributor Author

No worries :)
I think with #18 we can close this issue as it is resolved, right?

@IObert
Contributor

IObert commented Nov 29, 2021

Yes :)

@IObert IObert closed this as completed Nov 29, 2021