[Hotfix]change sql get vulnerabilities by package id #8540

Merged
merged 6 commits into from Apr 23, 2021

lyndaidaii
Contributor

@lyndaidaii lyndaidaii commented Apr 22, 2021

Improve the performance of querying vulnerabilities by package id

Addresses https://github.com/NuGet/Engineering/issues/3788#issuecomment-816126501

@lyndaidaii lyndaidaii marked this pull request as ready for review April 22, 2021 20:43
@lyndaidaii lyndaidaii requested a review from a team as a code owner April 22, 2021 20:43
@joelverhagen
Member

How have you validated this?

When I prototyped one attempt before, I made the code change, captured the generated SQL via an IntelliTrace event in VS, and then ran the SQL query against PROD USSC with IO statistics and the query plan enabled.

Looks like UTs and Functional tests are also unhappy.

@lyndaidaii
Contributor Author

lyndaidaii commented Apr 22, 2021

> How have you validated this?
>
> When I prototyped one attempt before, I made the code change, captured the generated SQL via an IntelliTrace event in VS, and then ran the SQL query against PROD USSC with IO statistics and the query plan enabled.
>
> Looks like UTs and Functional tests are also unhappy.

I haven't validated it yet. I'm working on the unit test changes. I think Drew is going to wake up soon, so I will hand over to him since he is most familiar with vulnerabilities.

@lyndaidaii lyndaidaii changed the title from "change sql get vulunerabilities by package id" to "change sql get vulnerabilities by package id" Apr 23, 2021
@drewgillies
Contributor

drewgillies commented Apr 23, 2021

This changes the following query:

exec sp_executesql N'SELECT 
    [Project1].[Key1] AS [Key], 
    [Project1].[Key] AS [Key1], 
    [Project1].[PackageRegistrationKey] AS [PackageRegistrationKey], 
    [Project1].[Copyright] AS [Copyright], 
    [Project1].[Created] AS [Created], 
    [Project1].[Description] AS [Description], 

...lots more columns...

    [Project1].[CertificateKey] AS [CertificateKey], 
    [Project1].[Id] AS [Id], 
    [Project1].[EmbeddedLicenseType] AS [EmbeddedLicenseType], 
    [Project1].[LicenseExpression] AS [LicenseExpression], 
    [Project1].[HasEmbeddedIcon] AS [HasEmbeddedIcon], 
    [Project1].[EmbeddedReadmeType] AS [EmbeddedReadmeType], 
    [Project1].[PackageDelete_Key] AS [PackageDelete_Key], 
    [Project1].[C1] AS [C1], 
    [Project1].[Key2] AS [Key2], 
    [Project1].[VulnerabilityKey] AS [VulnerabilityKey], 
    [Project1].[PackageId] AS [PackageId], 
    [Project1].[PackageVersionRange] AS [PackageVersionRange], 
    [Project1].[FirstPatchedPackageVersion] AS [FirstPatchedPackageVersion], 
    [Project1].[Key3] AS [Key3], 
    [Project1].[GitHubDatabaseKey] AS [GitHubDatabaseKey], 
    [Project1].[AdvisoryUrl] AS [AdvisoryUrl], 
    [Project1].[Severity] AS [Severity]
    FROM ( SELECT 
        [Extent1].[Key] AS [Key], 
        [Extent1].[PackageRegistrationKey] AS [PackageRegistrationKey], 
        [Extent1].[Copyright] AS [Copyright], 
        [Extent1].[Created] AS [Created], 
        [Extent1].[Description] AS [Description], 

...lots more columns...

        [Extent1].[PackageStatusKey] AS [PackageStatusKey], 
        [Extent1].[CertificateKey] AS [CertificateKey], 
        [Extent1].[Id] AS [Id], 
        [Extent1].[EmbeddedLicenseType] AS [EmbeddedLicenseType], 
        [Extent1].[LicenseExpression] AS [LicenseExpression], 
        [Extent1].[HasEmbeddedIcon] AS [HasEmbeddedIcon], 
        [Extent1].[EmbeddedReadmeType] AS [EmbeddedReadmeType], 
        [Extent1].[PackageDelete_Key] AS [PackageDelete_Key], 
        [Extent2].[Key] AS [Key1], 
        [Join3].[Key1] AS [Key2], 
        [Join3].[VulnerabilityKey] AS [VulnerabilityKey], 
        [Join3].[PackageId] AS [PackageId], 
        [Join3].[PackageVersionRange] AS [PackageVersionRange], 
        [Join3].[FirstPatchedPackageVersion] AS [FirstPatchedPackageVersion], 
        [Join3].[Key2] AS [Key3], 
        [Join3].[GitHubDatabaseKey] AS [GitHubDatabaseKey], 
        [Join3].[AdvisoryUrl] AS [AdvisoryUrl], 
        [Join3].[Severity] AS [Severity], 
        CASE WHEN ([Join3].[Key1] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
        FROM   [dbo].[Packages] AS [Extent1]
        INNER JOIN [dbo].[PackageRegistrations] AS [Extent2] ON [Extent1].[PackageRegistrationKey] = [Extent2].[Key]
        LEFT OUTER JOIN  (SELECT [Extent3].[Package_Key] AS [Package_Key], [Extent4].[Key] AS [Key1], [Extent4].[VulnerabilityKey] AS [VulnerabilityKey], [Extent4].[PackageId] AS [PackageId], [Extent4].[PackageVersionRange] AS [PackageVersionRange], [Extent4].[FirstPatchedPackageVersion] AS [FirstPatchedPackageVersion], [Extent5].[Key] AS [Key2], [Extent5].[GitHubDatabaseKey] AS [GitHubDatabaseKey], [Extent5].[AdvisoryUrl] AS [AdvisoryUrl], [Extent5].[Severity] AS [Severity]
            FROM   [dbo].[VulnerablePackageVersionRangePackages] AS [Extent3]
            INNER JOIN [dbo].[VulnerablePackageVersionRanges] AS [Extent4] ON [Extent3].[VulnerablePackageVersionRange_Key] = [Extent4].[Key]
            INNER JOIN [dbo].[PackageVulnerabilities] AS [Extent5] ON [Extent4].[VulnerabilityKey] = [Extent5].[Key] ) AS [Join3] ON [Extent1].[Key] = [Join3].[Package_Key]
        WHERE ([Extent2].[Id] = @p__linq__0) OR (1 = 0)
    )  AS [Project1]
    ORDER BY [Project1].[Key1] ASC, [Project1].[Key] ASC, [Project1].[C1] ASC',N'@p__linq__0 nvarchar(4000)',@p__linq__0=N'DotNetZip'

...to this:

exec sp_executesql N'SELECT 
    CASE WHEN ( EXISTS (SELECT 
        1 AS [C1]
        FROM   (SELECT [Extent1].[Key] AS [Key], [Extent1].[VulnerabilityKey] AS [VulnerabilityKey]
            FROM [dbo].[VulnerablePackageVersionRanges] AS [Extent1]
            WHERE [Extent1].[PackageId] = @p__linq__0 ) AS [Filter1]
        CROSS APPLY  (SELECT [Extent2].[VulnerablePackageVersionRange_Key] AS [VulnerablePackageVersionRange_Key]
            FROM  [dbo].[VulnerablePackageVersionRangePackages] AS [Extent2]
            LEFT OUTER JOIN  (SELECT 
                [Extent3].[Key] AS [Key]
                FROM [dbo].[PackageVulnerabilities] AS [Extent3]
                WHERE [Filter1].[VulnerabilityKey] = [Extent3].[Key] ) AS [Project1] ON 1 = 1
            WHERE [Filter1].[Key] = [Extent2].[VulnerablePackageVersionRange_Key] ) AS [Filter3]
    )) THEN cast(1 as bit) ELSE cast(0 as bit) END AS [C1]
    FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]',N'@p__linq__0 nvarchar(4000)',@p__linq__0=N'DotNetZip'

i.e. a much simpler query that doesn't load Packages or PackageRegistrations.
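
To make the shape change easier to see, here is a rough LINQ-to-Objects sketch. It is not the code from this PR: the classes are hypothetical in-memory stand-ins named after the tables in the SQL above, and both methods are reduced to the same yes/no question so they can be compared. The old query actually materialised full Package rows, which is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins named after the tables in the SQL above.
public sealed class PackageRegistration
{
    public string Id { get; set; } = "";
}

public sealed class VulnerablePackageVersionRange
{
    public string PackageId { get; set; } = "";
    public string PackageVersionRange { get; set; } = "";
    public List<Package> Packages { get; } = new List<Package>();
}

public sealed class Package
{
    public PackageRegistration PackageRegistration { get; set; } = new PackageRegistration();
    public string Version { get; set; } = "";
    public List<VulnerablePackageVersionRange> VulnerablePackageRanges { get; } =
        new List<VulnerablePackageVersionRange>();
}

public static class QueryShapes
{
    // Old shape: start from Packages, filter through PackageRegistrations,
    // then walk out to the vulnerable ranges.
    public static bool IsVulnerableViaPackages(IEnumerable<Package> packages, string id) =>
        packages
            .Where(p => p.PackageRegistration.Id == id)
            .SelectMany(p => p.VulnerablePackageRanges)
            .Any();

    // New shape: start from the ranges, which already carry PackageId,
    // so Packages and PackageRegistrations are never filtered or loaded.
    public static bool IsVulnerableViaRanges(IEnumerable<VulnerablePackageVersionRange> ranges, string id) =>
        ranges
            .Where(r => r.PackageId == id)
            .SelectMany(r => r.Packages)
            .Any();
}
```

Against a real database the difference is which tables the query has to touch, not the answer it returns.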

@drewgillies
Contributor

AI change on DotNetZip page in DEV. Average time prior to change: 5.6s; average time after change: 2.4s.

@@ -434,7 +433,7 @@ protected override void Load(ContainerBuilder builder)

             builder.RegisterType<PackageVulnerabilitiesService>()
                 .As<IPackageVulnerabilitiesService>()
-                .InstancePerLifetimeScope();
+                .SingleInstance();
Contributor

SingleInstance

PackageVulnerabilitiesService gets IEntityContext as input. The context must be instantiated at each call. Hence, PackageVulnerabilitiesService can't be a singleton.

Contributor

For an example of how to implement caching and avoid this problem, check out the typosquatting cache:
https://github.com/NuGet/NuGetGallery/blob/main/src/NuGetGallery/App_Start/DefaultDependenciesModule.cs#L410

Contributor

Ah, so sad--this will break caching.

Contributor

Reverting the caching commit.

Contributor

The typosquatting cache is pretty straightforward. The same approach can be reused. You can extract the relevant code to an abstract class.
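
For what it's worth, the general idea of a cache that can live as a singleton without holding on to a context could look roughly like the sketch below. This is not the typosquatting cache from the link above and not code from this PR. Every type name here is hypothetical, and the context interface is a simplified stand-in rather than the Gallery's IEntityContext. The only point is that the singleton keeps a factory and creates a fresh context on each refresh.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a short-lived data context (not the Gallery's IEntityContext).
public interface IVulnerabilityContext : IDisposable
{
    IReadOnlyCollection<string> GetVulnerablePackageIds();
}

// Trivial in-memory context, only here so the sketch can be exercised.
public sealed class InMemoryVulnerabilityContext : IVulnerabilityContext
{
    private readonly string[] _ids;
    public InMemoryVulnerabilityContext(params string[] ids) { _ids = ids; }
    public IReadOnlyCollection<string> GetVulnerablePackageIds() => _ids;
    public void Dispose() { }
}

// Safe to register as a single instance: it stores a factory, never a context.
public sealed class VulnerablePackageIdCache
{
    private readonly Func<IVulnerabilityContext> _contextFactory;
    private readonly TimeSpan _refreshInterval;
    private readonly Func<DateTimeOffset> _clock;
    private readonly object _lock = new object();
    private HashSet<string> _ids = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    private DateTimeOffset _lastRefresh = DateTimeOffset.MinValue;

    public VulnerablePackageIdCache(Func<IVulnerabilityContext> contextFactory, TimeSpan refreshInterval)
        : this(contextFactory, refreshInterval, () => DateTimeOffset.UtcNow)
    {
    }

    // The clock is injectable so expiry can be tested without waiting.
    public VulnerablePackageIdCache(
        Func<IVulnerabilityContext> contextFactory,
        TimeSpan refreshInterval,
        Func<DateTimeOffset> clock)
    {
        _contextFactory = contextFactory ?? throw new ArgumentNullException(nameof(contextFactory));
        _clock = clock ?? throw new ArgumentNullException(nameof(clock));
        _refreshInterval = refreshInterval;
    }

    public bool IsVulnerable(string packageId)
    {
        lock (_lock)
        {
            var now = _clock();
            if (now - _lastRefresh >= _refreshInterval)
            {
                // A new context per refresh; it is disposed before the lock is released.
                using (var context = _contextFactory())
                {
                    _ids = new HashSet<string>(
                        context.GetVulnerablePackageIds(),
                        StringComparer.OrdinalIgnoreCase);
                }
                _lastRefresh = now;
            }
            return _ids.Contains(packageId);
        }
    }
}
```

Holding the lock during the refresh keeps the sketch short but blocks other callers while the query runs; a real implementation would probably want to refresh outside the lock.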

@drewgillies
Contributor

drewgillies commented Apr 23, 2021

Query metrics delta: (local)

TIME statistics pre-change:

SQL Server parse and compile time: 
   CPU time = 5 ms, elapsed time = 5 ms.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

TIME statistics post-change:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 1 ms.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

IO statistics pre-change:

(1 row affected)
Table 'PackageVulnerabilities'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'VulnerablePackageVersionRanges'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'VulnerablePackageVersionRangePackages'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'PackageRegistrations'. Scan count 0, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Packages'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

IO statistics post-change:

(1 row affected)
Table 'VulnerablePackageVersionRanges'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'VulnerablePackageVersionRangePackages'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

using System.Data.Entity;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
Contributor

is this using required?

Contributor

No. Thanks. Probably a leftover from the caching.

@joelverhagen
Member

joelverhagen commented Apr 23, 2021

Adding TIME and IO statistics from PROD USSC.

BEFORE:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 16 ms, elapsed time = 16 ms.

(27 rows affected)
Table 'Worktable'. Scan count 1, logical reads 17773, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'VulnerablePackageVersionRangePackages'. Scan count 315, logical reads 716, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'PackageVulnerabilities'. Scan count 0, logical reads 630, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'VulnerablePackageVersionRanges'. Scan count 1, logical reads 8, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Packages'. Scan count 1, logical reads 112, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'PackageRegistrations'. Scan count 1, logical reads 3, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 62 ms,  elapsed time = 68 ms.

 SQL Server Execution Times:
   CPU time = 78 ms,  elapsed time = 85 ms.

AFTER:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 15 ms, elapsed time = 25 ms.

(14 rows affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'PackageVulnerabilities'. Scan count 0, logical reads 28, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'VulnerablePackageVersionRangePackages'. Scan count 2, logical reads 6, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'VulnerablePackageVersionRanges'. Scan count 2, logical reads 11, physical reads 1, page server reads 0, read-ahead reads 1, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 3 ms.

 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 29 ms.

Completion time: 2021-04-23T09:57:28.5301134-07:00
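
To compare the two runs at a glance, the per-table logical reads can be summed. The helper below is just a convenience sketch; `StatisticsIo` is a made-up name, and it assumes the English `SET STATISTICS IO` message format shown above, where each table line starts with "Table '...'. Scan count N, logical reads N".

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class StatisticsIo
{
    // Anchoring on "Scan count N, logical reads N" picks up only the first
    // "logical reads" figure on each table line, not "lob logical reads".
    private static readonly Regex TableLine = new Regex(
        @"^Table '(?<table>[^']+)'\. Scan count \d+, logical reads (?<reads>\d+)",
        RegexOptions.Multiline);

    // Sums logical reads across every table line in the pasted output.
    public static long TotalLogicalReads(string statisticsOutput) =>
        TableLine.Matches(statisticsOutput)
            .Cast<Match>()
            .Sum(m => long.Parse(m.Groups["reads"].Value));
}
```

Adding up the PROD USSC figures above gives 19242 logical reads before (17773 of them on the Worktable) and 45 after.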

@lyndaidaii lyndaidaii merged commit a2ab1c7 into main Apr 23, 2021
@lyndaidaii lyndaidaii changed the title from "change sql get vulnerabilities by package id" to "[Hotfix]change sql get vulnerabilities by package id" Apr 23, 2021
@lyndaidaii lyndaidaii mentioned this pull request Apr 23, 2021
14 tasks
loic-sharma added a commit that referenced this pull request Apr 29, 2021
loic-sharma added a commit that referenced this pull request Apr 29, 2021
4 participants