Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reading large documents takes a long time #8076

Open
droyad opened this issue Mar 8, 2023 · 2 comments
Open

Reading large documents takes a long time #8076

droyad opened this issue Mar 8, 2023 · 2 comments

Comments

@droyad
Copy link
Contributor

droyad commented Mar 8, 2023

When reading documents where the JSON is large, 4MB takes about 22 seconds. The time increases quickly from there to hours for 40MB.

It does seem that lots of node CPU is used as well.

This is due to this bug.

The code paths that use CommandBehavior.SequentialAccess or the load syncronously are not affected, but those that don't could take hours to load for sufficiently large datasets.

We do not use CommandBehavior.SequentialAccess in our ToListAsync methods as that uses Nevermore.Advanced.Queryable.QueryTranslator which does not set that value. Selecting by ID is not affected.

As far as I can see we should use SequentialAccess in all cases.

Provides a way for the DataReader to handle rows that contain columns with large binary values. Rather than loading the entire row, SequentialAccess enables the DataReader to load data as a stream. You can then use the GetBytes or GetChars method to specify a byte location to start the read operation, and a limited buffer size for the data being returned.

Reproduction:

using var con = new SqlConnection("Server=localhost;Database=OctopusDeploy;Trusted_Connection=True;Encrypt=False");
await con.OpenAsync();

var command = con.CreateCommand();
command.CommandText = "SELECT * FROM [dbo].[Deployment] WHERE [Id] = 'Deployments-5204'";

CommandBehavior behaviour = default;
//behaviour = CommandBehavior.SequentialAccess;
var reader = await command.ExecuteReaderAsync(behaviour);

var sw = Stopwatch.StartNew();
await reader.ReadAsync();
//reader.Read();

reader["Id"].Dump();
((string) reader["Json"]).Length.Dump();

sw.Elapsed.Dump();

@droyad
Copy link
Contributor Author

droyad commented Mar 8, 2023

We should also look at using SingleResult everywhere because in almost all cases (dashboard being an exception) we only return a single result set.

Also SingleRow should be used when reading Scalars, etc (it is not in some places)

@droyad
Copy link
Contributor Author

droyad commented May 9, 2023

We've also found that the code path taken with QuerySql when UseCteBasedListWithCount feature flag is on also doesn't apply this flag.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant