Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.Xml.XmlTextReaderImpl could possibly benefit from Span #59829

Open
johnthcall opened this issue Sep 30, 2021 · 2 comments
Open

System.Xml.XmlTextReaderImpl could possibly benefit from Span #59829

johnthcall opened this issue Sep 30, 2021 · 2 comments
Assignees
Labels
area-System.Xml tenet-performance Performance related issue
Milestone

Comments

@johnthcall
Copy link
Contributor

johnthcall commented Sep 30, 2021

Description

In the Azure SDK ecosystem errors are serialized by the server as XML and parsed into XDocument, In the stack trace below the string message is wrapped into a TextReader which is then passed down to the XmlTextReaderImpl where a buffer is allocated. Even in the case when InitTextReaderInput is called with a string source instead of a TextReader the entire string is copied to a new char[] with a (char)0 ending.

It seems that if span can be leveraged that significant performance could be gained. If no one sees a clear path for this, no worries just want to make sure there isn't any low hanging fruit here.

Configuration

  • Which version of .NET is the code running on? 5.0.9
  • What OS version, and what distro if applicable? Windows Server 2019
  • What is the architecture (x64, x86, ARM, ARM64)? x64

Data

clr.dll!JIT_NewArr1
System.Xml.ni.dll!System.Xml.XmlTextReaderImpl.InitTextReaderInput(System.String, System.Uri, System.IO.TextReader)$##6000B97
System.Xml.ni.dll!System.Xml.XmlTextReaderImpl.FinishInitTextReader()$##6000B10
System.Xml.ni.dll!System.Xml.XmlReaderSettings.CreateReader(System.IO.TextReader, System.String, System.Xml.XmlParserContext)$##6000A31
System.Xml.Linq.ni.dll!System.Xml.Linq.XDocument.Parse(System.String, System.Xml.Linq.LoadOptions)$##60001A0
Azure.Core.Pipeline.StorageClientDiagnostics!ExtractFailureContent

Analysis

I initially tried altering the following by verifying that 0 ending array is required and making it a direct copy of the string does cause two unit tests to break.

_ps.chars = new char[len + 1];
str.CopyTo(0, _ps.chars, 0, str.Length);
_ps.charsUsed = len;
_ps.chars[len] = (char)0;

If the 0 ending array is strictly required then a copy of the full array will always be required.

I'm happy to contribute changes here but don't know if it is feasible or the best place to start.

@johnthcall johnthcall added the tenet-performance Performance related issue label Sep 30, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Xml untriaged New issue has not been triaged by the area owner labels Sep 30, 2021
@ghost
Copy link

ghost commented Sep 30, 2021

Tagging subscribers to this area: @dotnet/area-system-xml
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

In the Azure SDK ecosystem errors are serialized by the server as XML and parsed into XDocument, In the stack trace below the string message is wrapped into a TextReader which is then passed down to the XmlTextReaderImpl where a buffer is allocated. Even in the case when InitTextReaderInput is called with a string source instead of a TextReader the entire string is copied to a new char[] with a (char)0 ending.

It seems that if span can be leveraged that significant performance could be gained. If no one sees a clear path for this, no worries just want to make sure there isn't any low hanging fruit here.

Configuration

  • Which version of .NET is the code running on? 5.0.9
  • What OS version, and what distro if applicable? Windows Server 2019
  • What is the architecture (x64, x86, ARM, ARM64)? x64

Data

clr.dll!JIT_NewArr1
System.Xml.ni.dll!System.Xml.XmlTextReaderImpl.InitTextReaderInput(System.String, System.Uri, System.IO.TextReader)$##6000B97
System.Xml.ni.dll!System.Xml.XmlTextReaderImpl.FinishInitTextReader()$##6000B10
System.Xml.ni.dll!System.Xml.XmlReaderSettings.CreateReader(System.IO.TextReader, System.String, System.Xml.XmlParserContext)$##6000A31
System.Xml.Linq.ni.dll!System.Xml.Linq.XDocument.Parse(System.String, System.Xml.Linq.LoadOptions)$##60001A0
Azure.Core.Pipeline.StorageClientDiagnostics!ExtractFailureContent

Analysis

I initially tried altering the following by verifying that 0 ending array is required and making it a direct copy of the string does cause two unit tests to break.

_ps.chars = new char[len + 1];
str.CopyTo(0, _ps.chars, 0, str.Length);
_ps.charsUsed = len;
_ps.chars[len] = (char)0;

If the 0 ending array is strictly required then a copy of the full array will always be required.

Author: johnthcall
Assignees: -
Labels:

area-System.Xml, tenet-performance, untriaged

Milestone: -

@ghost ghost added this to Needs triage in Triage POD for Meta, Reflection, etc Sep 30, 2021
@eiriktsarpalis
Copy link
Member

Hi @johnthcall, we'd be happy to consider a PR that adds spanification improvements to the XmlTextReader implementation. Would it be possible to also share benchmarks that demonstrate the perf improvements?

@eiriktsarpalis eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Oct 1, 2021
@eiriktsarpalis eiriktsarpalis added this to the Future milestone Oct 1, 2021
@buyaa-n buyaa-n removed this from Needs triage in Triage POD for Meta, Reflection, etc Oct 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Xml tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

2 participants