Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V3 client avoid stream copy for synchroneous requests. #656

Closed

Conversation

Projects
None yet
4 participants
@uffelauesen
Copy link

commented Aug 1, 2016

Issues

This pull request fixes issue #655.

Description

Http Response Stream copying removed from the synchroneous Execute query path. The responseMessage in QueryResult is sent down to the materializer for it to dispose when done enumerating.

Checklist (Uncheck if it is not completed)

  • [ ] Test cases added
  • [ ] Build and test with one-click build and test script passed

Additional work necessary

If documentation update is needed, please add "Docs Needed" label to the issue and provide details about the required document change in the issue.

Removed in-memory response steam copying from the synchroneous Execut…
…Query path. The http response stream is now used directly. Disposing is deferred untill enumeration has completed.
@msftclas

This comment has been minimized.

Copy link

commented Aug 1, 2016

Hi @uffelauesen, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes. I promise there's no faxing. https://cla.microsoft.com.

TTYL, MSBOT;

@msftclas

This comment has been minimized.

Copy link

commented Aug 2, 2016

@uffelauesen, Thanks for signing the contribution license agreement so quickly! Actual humans will now validate the agreement and then evaluate the PR.

Thanks, MSBOT;

@msftclas msftclas added cla-signed and removed cla-required labels Aug 2, 2016

@@ -87,14 +88,15 @@ public HttpWebResponseMessage(HttpWebResponse httpResponse)
/// <param name="headers">The headers.</param>
/// <param name="statusCode">The status code.</param>
/// <param name="getResponseStream">A function returning the response stream.</param>
internal HttpWebResponseMessage(HeaderCollection headers, int statusCode, Func<Stream> getResponseStream)
internal HttpWebResponseMessage(HeaderCollection headers, int statusCode, Func<Stream> getResponseStream, IODataResponseMessage underlyingResponseMessage)

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 5, 2016

Contributor

getResponseStream and underlyingResponseMessage are kind of dup. Why not add an overloaded contructor to accept underlyingResponseMessage only, so other caller does not need change.

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 5, 2016

Author

Yes – good idea, I’ll fix this, we do however still need the Func<> parameter to avoid GetStream() being called multiple times - you have a Debug.Assert preventing this.

@@ -103,6 +108,7 @@ internal class MaterializeAtom : IDisposable, IEnumerable, IEnumerator
this.expectingPrimitiveValue = PrimitiveType.IsKnownNullableType(elementType);

Debug.Assert(responseMessage != null, "Response message is null! Did you mean to use Materializer.ResultsWrapper/EmptyResults?");
this.responseMessage = responseMessage;

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 5, 2016

Contributor

Why do we need back up responseMessage here? Seems it is not used in this file.

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 5, 2016

Author

We hold on to it so we can dispose it off when the MaterializeAtom is disposed.

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 6, 2016

Contributor

Then what about other places? Seems responseMessage will be passed to lots of places, for example, BatchResult...

@@ -740,5 +718,20 @@ private MaterializeAtom CreateMaterializer(ProjectionPlan plan, ODataPayloadKind
responseMessageWrapper,
payloadKind);
}

// $count queries need a way to dispose responseMessage as no enummeration occurs.

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 5, 2016

Contributor

What about LoadPropertyResult which inherits from QueryResult? Seems this dispose() will not be called for LoadProperty...

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 5, 2016

Author

No need to. The Dispose is there for requests that does not enumerate to have a way of disposing the response stream. LoadProperty enummerate and the HttpWebResponseMessage and the wrapped underlying response stream is exposed in this scenario just like when doing a regular Query. Only the /$Count Query needs the Dispose explicit.

Review taken into account. Added an overloaded constructor to handle …
…this new stream execution path to avoid changes to BatchSaveResult and SaveResult. Also fixed Dispose logic in HttpWebResponseMessage to ensure underlyingResponseMessage isn't disposed twice.
/// <param name="statusCode">The status code.</param>
/// <param name="getResponseStream">A function returning the response stream.</param>
/// <param name="underlyingResponseMessage">The underlying response message that should be disposed together with this instance.</param>
internal HttpWebResponseMessage(HeaderCollection headers, int statusCode, Func<Stream> getResponseStream, IODataResponseMessage underlyingResponseMessage)

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 6, 2016

Contributor

getResponseStream and underlyingResponseMessage are dup. getResponseStream can be removed, and set this.getResponseStream to underlyingResponseMessage.getStream

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 6, 2016

Author

I tried this already. A debug build will fail on this assert in HttpWebResponse.GetStream():
Debug.Assert(!this.streamReturned, "The GetStream can only be called once.");
As there are actually two HttpWebResponseMessages, the first one is constructed from HttpWebResponseMessage(HttpWebResponse httpResponse), the other one constructed from HttpWebResponseMessage(HeaderCollection headers, int statusCode, Func getResponseStream, IODataResponseMessage underlyingResponseMessage) that actually wraps the first one. This would lead to GetStream being called twice on the first HttpWebResponseMessage.

@@ -36,7 +36,7 @@ namespace System.Data.Services.Client
/// <summary>
/// Wrapper HttpWebRequest &amp; HttWebResponse
/// </summary>
internal class QueryResult : BaseAsyncResult
internal class QueryResult : BaseAsyncResult, IDisposable

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 6, 2016

Contributor

BaseAsyncResult has a Dispose function, can we reuse that?

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 6, 2016

Author

Agreed. I do not like the addition of IDisposable to QueryResult. It is after all the the IODataResponseMessage we want to dispose, not the QueryResult it self. I will implement another solution for allowing the syncronous GetQuerySetCount ($count ) to dispose the message.

GC.SuppressFinalize(this);
}

protected virtual void Dispose(bool disposing)

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 6, 2016

Contributor

When will the input parameter be false?

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 6, 2016

Author

I'll remove this part and find another solution.

Review taken into account. QueryResult are no longer IDisposable inst…
…ead the internal ExecuteQuery now returns the constructed IODataResponseMessage for callers to dispose off when needed (only $count queries need this).
@@ -69,6 +69,11 @@ internal class MaterializeAtom : IDisposable, IEnumerable, IEnumerator
private bool moved;

/// <summary>
/// The ODataResponseMessage to dispose when we are done.
/// </summary>
private IODataResponseMessage responseMessage;

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 7, 2016

Contributor

Seems MaterializeFromAtom.Dispose will not be called explicitly. One of our cases is failed because the response message is not disposed. What about like handling $count, we use a "finally" of every caller of ExecuteQuery() to dispose response message?

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 7, 2016

Author

The idea is to keep the response message (or more specifically the http connection) alive during enumeration so the materializing process can read the data directly from the wire. If I dispose of the response message (and thus closing the http connection) after ExecuteQuery() have been called (like in the $count code path), the enumeration will fail as it cannot read from the closed http connection.

Could you elobarate on what the failing case specifically is?

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 7, 2016

Contributor

DisposeShouldBeCalledOnResponseMessageForExecuteWithNoContent,

Code is like:
context.Execute(new Uri("http://host/voidAction", UriKind.Absolute), "POST").StatusCode.Should().Be(204);
responseMessageDisposed.Should().BeTrue();

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 8, 2016

Author

Thanks, I'll look into this.

This comment has been minimized.

Copy link
@uffelauesen

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 8, 2016

Contributor

Seems there is a side effect of the fix is that when we are doing some operation, like create an entity, it will return 201, and we will not do enumeration after the operation, then the message will not dispose.

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 8, 2016

Author

Ah... Now I see. In cases where the create is genarated by hand by ctx.Execute - we do not know if the caller will actually enumerate the results. Like this?...

ctx.Execute(new Uri("http://sjunittest/OData/Files"), "POST", true, new BodyOperationParameter("odata.type","Som.File"), new BodyOperationParameter("Title","5.8.0"),new BodyOperationParameter("FileClass_Value","UT"));

I'll try to se how I can handle this.

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 8, 2016

Author

Acually the same is true for GET. If you do ctx.Execute by hand - and don't do any enumeration/foreach of the result - the message will not be disposed. This has actually always been the case, but the difference is now the stream (the stuff that we are actually interested in disposing) is a http/network stream whereas before it was a memory stream. So the issue have always been there, the diference is just what type of stream we are talking about. At some point GC will handle these scenarios.

Is this correctly understood?
Is it a major issue due to the fact that the stream now is a Network stream? If so, I really do not see any way around this, except we could make this a configuration/flag that you need to enable to get this new optimized behaviour, but at the expense of you/the programmer have to ensure they enumerate the results or dispose them.

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 8, 2016

Author

Aparently this will not dispose the message (or the stream)...

            var e = ctx.Execute<File>(new Uri("http://host/OData/Files"));
            var en = e.GetEnumerator();
            en.Dispose();

But this will...

            var e = ctx.Execute<File>(new Uri("http://host/OData/Files"));
            var en = e.GetEnumerator();
            en.MoveNext();
            en.Dispose();

We can fix this scenario, but should we - it seems like it has always been like this?

Should this be reported as a separate issue and PR, or can we fix this as part of this PR?

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 9, 2016

Contributor

Yes, so before we may have body not disposed, and now we have the whole message. I agree that this can be reported as a separate issue.

@@ -47,6 +47,8 @@ public class HttpWebResponseMessage : IODataResponseMessage, IDisposable
/// <summary>HttpWebResponse instance.</summary>
private HttpWebResponse httpWebResponse;

/// <summary>The IODataResponseMessage to be disposed together with this HttpWebResponseMessage.</summary>
private IODataResponseMessage underlyingResponseMessage;

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 7, 2016

Contributor

If we use a finally to dispose responseMessage explicitly in the caller of ExecuteQuery, seems we do not need to save the message here.

@@ -272,6 +273,10 @@ internal long GetQuerySetCount(DataServiceContext context)

throw;
}
finally
{

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 7, 2016

Contributor

Then seems this will cause the response message disposed before enumeration as well

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 8, 2016

Author

No, no enumeration occurs for GetQuerySetCount, it simply returns a long with count value returned in the message. The message body is read entirely within a stream reader, like: r = XmlConvert.ToInt64(sr.ReadToEnd()); We are done with the message after this and it is safe to dispose it off.

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 8, 2016

Contributor

This is not for GetQuerySetCount... this function will be called by all paths

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 8, 2016

Author

Sorry, I don't get that. As far as I can see the line you are referring to 276-277 is within GetQuerySetCount. GetQuerySetCount() is not called from all code paths, Am I on the wrong code line?

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 9, 2016

Contributor

Sorry, maybe I'm on the wrong code line. Anyway, I get your point, we need explicitly dispose in GetQuerySetCount since we do not have enumeration for $count.

Review taken into account. ExecuteQuery now disposes the message expl…
…icit in special cases where the response is NoContent.
@Zoe-ms

This comment has been minimized.

Copy link
Contributor

commented Sep 9, 2016

Hi Uffe, thanks for your corporation so far. I will add some tests and then send internal code review before merging this PR. Will update if I get any feedback. Thanks.

this.SetHttpWebResponse(Util.NullCheck(response, InternalError.InvalidGetResponse));

if (HttpStatusCode.NoContent != this.StatusCode)
{
using (Stream stream = this.responseMessage.GetStream())
Stream stream = this.responseMessage.GetStream();

This comment has been minimized.

Copy link
@Zoe-ms

Zoe-ms Sep 13, 2016

Contributor

Hi Uffe, I ran a full test on this change, and unfortunately it breaks many cases. The reason is that the test code internally implements a logging stream for response and add log for every Read(). And when GetStream() from the logging stream, it will return the logged bytes. So before this change, our code will copy stream which will call Read() so every bytes will be logged. Since now there is no copy, then the GetStream() cannot return correct result. It may be a very corner scenario, but anyway it is a behavior change. So to keep the change, can we add an option to turn on null stream copy? What do you think?

This comment has been minimized.

Copy link
@uffelauesen

uffelauesen Sep 14, 2016

Author

Ok, I'll see what I can do.

Added option boolean AllowDirectNetworkStreamReading to DataServiceCo…
…ntext to enable/disable direct stream reading. Default is false.
@Zoe-ms

This comment has been minimized.

Copy link
Contributor

commented Sep 19, 2016

Merge locally, thanks!

@LaylaLiu

This comment has been minimized.

Copy link
Contributor

commented Oct 31, 2016

Checked in by commit e101af0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.