chart by package version #132

Open
MarkPflug opened this issue Jan 14, 2021 · 6 comments
Labels
enhancement New feature or request

Comments

@MarkPflug

It would be nice if the graph could be narrowed down to a specific version of the package, or maybe even shown as a stacked bar chart broken down by version. If this were paired with an x-axis overlay of version releases (a colored vertical line per release), it would help visualize how quickly new versions are adopted and how much old versions are still in use.

@bruno-garcia
Member

That would be nice. We would need to collect download numbers per version though, which we don't.

@bruno-garcia added the enhancement label on Jan 15, 2021

@MarkPflug
Author

Oh, I see. I was assuming that since nuget.org shows the per-version download numbers that you'd have access to that data as well.

@bruno-garcia
Member

@MarkPflug it might be available through one of nuget.org's APIs, but right now we hit each package (not each version of it) once per day and get the total number (across all versions). So we need to change that. That said, unless we can fetch the whole thing with a single hit to their API, it will likely need some redesign of the job: it takes 1 or 2 hours to go through the 220000+ packages right now.

Probably a good chance to simplify the backend.

@loic-sharma

You should be able to get the downloads by version using your current approach. The search response contains a breakdown of downloads by version. For example: https://azuresearch-usnc.nuget.org/query?q=packageid:Newtonsoft.Json&take=1

{
    ...
    "totalHits": 1,
    "data": [
        {
            ...
            "id": "Newtonsoft.Json",
            "version": "12.0.3",
            "totalDownloads": 824781418,
            ...
            "versions": [
                {
                    "version": "3.5.8",
                    "downloads": 586170,
                    "@id": "https://api.nuget.org/v3/registration5-semver1/newtonsoft.json/3.5.8.json"
                },
                ...
                {
                    "version": "12.0.3",
                    "downloads": 83014646,
                    "@id": "https://api.nuget.org/v3/registration5-semver1/newtonsoft.json/12.0.3.json"
                }
            ]
        }
    ]
}
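
As a rough illustration of reading that breakdown, here is a minimal sketch that pulls the per-version counts out of a response shaped like the one above. The payload is a trimmed copy of the sample (elided fields dropped), `ReadDownloadsByVersion` is a hypothetical helper name, and the raw string literal assumes C# 11 or later; this is not the project's actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Trimmed copy of the sample response: only the fields read below are kept.
const string sampleResponse = """
{
  "totalHits": 1,
  "data": [
    {
      "id": "Newtonsoft.Json",
      "totalDownloads": 824781418,
      "versions": [
        { "version": "3.5.8", "downloads": 586170 },
        { "version": "12.0.3", "downloads": 83014646 }
      ]
    }
  ]
}
""";

// Hypothetical helper: maps each version string to its download count
// for the first package in the search result (empty if there are no hits).
static Dictionary<string, long> ReadDownloadsByVersion(string json)
{
    using var doc = JsonDocument.Parse(json);
    var result = new Dictionary<string, long>();

    var data = doc.RootElement.GetProperty("data");
    if (data.GetArrayLength() == 0)
    {
        return result;
    }

    foreach (var entry in data[0].GetProperty("versions").EnumerateArray())
    {
        var version = entry.GetProperty("version").GetString() ?? "";
        result[version] = entry.GetProperty("downloads").GetInt64();
    }

    return result;
}

var downloads = ReadDownloadsByVersion(sampleResponse);
foreach (var (version, count) in downloads)
{
    Console.WriteLine($"{version}: {count}");
}
```

Because the breakdown comes back in the same response as the total, this would not add requests per package, only more rows to store.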

@loic-sharma

loic-sharma commented Jan 17, 2021

You should be able to get the downloads by version by calling packageMetadata.GetVersionsAsync() here:

context.DailyDownloads.Add(new DailyDownload
{
    PackageId = packageMetadata.Identity.Id,
    Date = DateTime.UtcNow.Date,
    DownloadCount = packageMetadata.DownloadCount
});

FYI, the method is async but it doesn't do anything expensive like additional web requests when using the V3 protocol (see this).
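
A minimal sketch of what the per-version variant might look like, assuming the NuGet.Protocol package is referenced. `ToVersionRows` and the tuple shape are hypothetical (the real schema would need a new version column or table), and treating a missing `DownloadCount` as 0 is an assumption:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using NuGet.Protocol.Core.Types;

// Hypothetical mapping: one row per version, mirroring the existing
// DailyDownload columns plus a Version column.
static List<(string PackageId, string Version, DateTime Date, long DownloadCount)> ToVersionRows(
    string packageId, DateTime date, IEnumerable<VersionInfo> versions)
{
    return versions
        .Select(v => (
            PackageId: packageId,
            Version: v.Version.ToNormalizedString(),
            Date: date,
            // VersionInfo.DownloadCount is nullable; defaulting to 0 is an assumption.
            DownloadCount: v.DownloadCount ?? 0L))
        .ToList();
}
```

At the call site above, this could be fed with `await packageMetadata.GetVersionsAsync()`, passing `packageMetadata.Identity.Id` and `DateTime.UtcNow.Date` for the other two arguments.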

P.S. Nice CSV library @MarkPflug :)

@bruno-garcia
Member

bruno-garcia commented Jan 18, 2021

Thanks for the pointers, @loic-sharma. The only question left is whether we want to do that in the current architecture. I wonder how much more data per day we'd be dumping into PostgreSQL. @clairernovotny mentioned the foundation can host the site on Azure, so maybe we can use blob storage to dump these numbers, given that they are immutable, or some other strategy. We can probably get rid of RabbitMQ too, which is used only to queue the batches of IDs to hit nuget.org. We'd need some other way to make the job resumable, so that we can restart it without starting from the beginning.
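
One possible shape for a restartable job without a queue, sketched under assumptions: package IDs are processed in a fixed (ordinal) order, and the last ID of each finished batch is written to a checkpoint. `RunResumable`, the file-based checkpoint, and the `processBatch` callback are all hypothetical, and `Chunk` assumes .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical resumable runner: walks the IDs in ordinal order, batch by batch,
// and records the last ID of each finished batch so a restart skips completed work.
// Returns the number of IDs processed in this run.
static int RunResumable(
    IEnumerable<string> packageIds,
    string checkpointPath,
    int batchSize,
    Action<IReadOnlyList<string>> processBatch)
{
    var lastDone = File.Exists(checkpointPath) ? File.ReadAllText(checkpointPath) : null;

    var pending = packageIds
        .OrderBy(id => id, StringComparer.Ordinal)
        .Where(id => lastDone is null || string.CompareOrdinal(id, lastDone) > 0);

    var processed = 0;
    foreach (var batch in pending.Chunk(batchSize))
    {
        processBatch(batch);                           // e.g. fetch and store the counts
        File.WriteAllText(checkpointPath, batch[^1]);  // written only after the batch succeeds
        processed += batch.Length;
    }

    return processed;
}
```

A daily job would also need the checkpoint keyed by date (or cleared after a full pass) so that the next day's run starts from the first ID again. The same idea would work with a blob or a database row in place of the local file.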
