System.Xml.XmlTextReaderImpl could possibly benefit from Span #59829

johnthcall · 2021-09-30T21:25:31Z

Description

In the Azure SDK ecosystem errors are serialized by the server as XML and parsed into XDocument, In the stack trace below the string message is wrapped into a TextReader which is then passed down to the XmlTextReaderImpl where a buffer is allocated. Even in the case when InitTextReaderInput is called with a string source instead of a TextReader the entire string is copied to a new char[] with a (char)0 ending.

It seems that if span can be leveraged that significant performance could be gained. If no one sees a clear path for this, no worries just want to make sure there isn't any low hanging fruit here.

Configuration

Which version of .NET is the code running on? 5.0.9
What OS version, and what distro if applicable? Windows Server 2019
What is the architecture (x64, x86, ARM, ARM64)? x64

Data

clr.dll!JIT_NewArr1
System.Xml.ni.dll!System.Xml.XmlTextReaderImpl.InitTextReaderInput(System.String, System.Uri, System.IO.TextReader)$##6000B97
System.Xml.ni.dll!System.Xml.XmlTextReaderImpl.FinishInitTextReader()$##6000B10
System.Xml.ni.dll!System.Xml.XmlReaderSettings.CreateReader(System.IO.TextReader, System.String, System.Xml.XmlParserContext)$##6000A31
System.Xml.Linq.ni.dll!System.Xml.Linq.XDocument.Parse(System.String, System.Xml.Linq.LoadOptions)$##60001A0
Azure.Core.Pipeline.StorageClientDiagnostics!ExtractFailureContent

Analysis

I initially tried altering the following by verifying that 0 ending array is required and making it a direct copy of the string does cause two unit tests to break.

runtime/src/libraries/System.Private.Xml/src/System/Xml/Core/XmlTextReaderImpl.cs

Lines 3058 to 3061 in 918aaec

    
           _ps.chars = new char[len + 1]; 
        
           str.CopyTo(0, _ps.chars, 0, str.Length); 
        
           _ps.charsUsed = len; 
        
           _ps.chars[len] = (char)0;

If the 0 ending array is strictly required then a copy of the full array will always be required.

I'm happy to contribute changes here but don't know if it is feasible or the best place to start.

ghost · 2021-09-30T21:25:36Z

Tagging subscribers to this area: @dotnet/area-system-xml
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

In the Azure SDK ecosystem errors are serialized by the server as XML and parsed into XDocument, In the stack trace below the string message is wrapped into a TextReader which is then passed down to the XmlTextReaderImpl where a buffer is allocated. Even in the case when InitTextReaderInput is called with a string source instead of a TextReader the entire string is copied to a new char[] with a (char)0 ending.

It seems that if span can be leveraged that significant performance could be gained. If no one sees a clear path for this, no worries just want to make sure there isn't any low hanging fruit here.

Configuration

Which version of .NET is the code running on? 5.0.9
What OS version, and what distro if applicable? Windows Server 2019
What is the architecture (x64, x86, ARM, ARM64)? x64

Data

clr.dll!JIT_NewArr1
System.Xml.ni.dll!System.Xml.XmlTextReaderImpl.InitTextReaderInput(System.String, System.Uri, System.IO.TextReader)$##6000B97
System.Xml.ni.dll!System.Xml.XmlTextReaderImpl.FinishInitTextReader()$##6000B10
System.Xml.ni.dll!System.Xml.XmlReaderSettings.CreateReader(System.IO.TextReader, System.String, System.Xml.XmlParserContext)$##6000A31
System.Xml.Linq.ni.dll!System.Xml.Linq.XDocument.Parse(System.String, System.Xml.Linq.LoadOptions)$##60001A0
Azure.Core.Pipeline.StorageClientDiagnostics!ExtractFailureContent

Analysis

I initially tried altering the following by verifying that 0 ending array is required and making it a direct copy of the string does cause two unit tests to break.

runtime/src/libraries/System.Private.Xml/src/System/Xml/Core/XmlTextReaderImpl.cs

Lines 3058 to 3061 in 918aaec

    
           _ps.chars = new char[len + 1]; 
        
           str.CopyTo(0, _ps.chars, 0, str.Length); 
        
           _ps.charsUsed = len; 
        
           _ps.chars[len] = (char)0;

If the 0 ending array is strictly required then a copy of the full array will always be required.

Author:	johnthcall
Assignees:	-
Labels:	`area-System.Xml`, `tenet-performance`, `untriaged`
Milestone:	-

eiriktsarpalis · 2021-10-01T08:32:33Z

Hi @johnthcall, we'd be happy to consider a PR that adds spanification improvements to the XmlTextReader implementation. Would it be possible to also share benchmarks that demonstrate the perf improvements?

johnthcall added the tenet-performance Performance related issue label Sep 30, 2021

dotnet-issue-labeler bot added area-System.Xml untriaged New issue has not been triaged by the area owner labels Sep 30, 2021

ghost added this to Needs triage in Triage POD for Meta, Reflection, etc Sep 30, 2021

eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Oct 1, 2021

eiriktsarpalis added this to the Future milestone Oct 1, 2021

eiriktsarpalis assigned johnthcall Oct 1, 2021

buyaa-n removed this from Needs triage in Triage POD for Meta, Reflection, etc Oct 5, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

System.Xml.XmlTextReaderImpl could possibly benefit from Span #59829

System.Xml.XmlTextReaderImpl could possibly benefit from Span #59829

johnthcall commented Sep 30, 2021 •

edited

ghost commented Sep 30, 2021

Description

Configuration

Data

Analysis

eiriktsarpalis commented Oct 1, 2021

System.Xml.XmlTextReaderImpl could possibly benefit from Span #59829

System.Xml.XmlTextReaderImpl could possibly benefit from Span #59829

Comments

johnthcall commented Sep 30, 2021 • edited

Description

Configuration

Data

Analysis

ghost commented Sep 30, 2021

Description

Configuration

Data

Analysis

eiriktsarpalis commented Oct 1, 2021

johnthcall commented Sep 30, 2021 •

edited