This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

Implement DataFrame.LoadCsvFromString #2988

Merged (2 commits) on Oct 30, 2020

@pgovind (Contributor) commented Oct 28, 2020

Fixes #2984

/// <param name="encoding">The character encoding. Defaults to UTF8 if not specified</param>
/// <returns><see cref="DataFrame"/></returns>
public static DataFrame LoadCsv(Stream csvStream,
private static DataFrame ReadCsvLinesIntoDataFrame(IEnumerable<string> lines,
@pgovind (Contributor, Author) commented:

This method is just a refactor of the logic in LoadCsv with one simple change: the first parameter here is an IEnumerable<string>.
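The shape of that refactor can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `RowCounter`, `FromReader`, `FromString`, and `CountCore` are hypothetical names, and counting rows stands in for the real DataFrame-building logic. The point is only that both entry points reduce their input to an `IEnumerable<string>` and share one core method.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical illustration of the refactor pattern: two public entry points
// delegate to one private core that only sees a sequence of lines.
public static class RowCounter
{
    // Entry point for reader/stream-based input.
    public static int FromReader(TextReader reader, bool header = true)
        => CountCore(ReadLines(reader), header);

    // Entry point for in-memory CSV text.
    public static int FromString(string csv, bool header = true)
        => CountCore(ReadLines(new StringReader(csv)), header);

    // Shared core: independent of where the lines came from.
    private static int CountCore(IEnumerable<string> lines, bool header)
    {
        int count = 0;
        foreach (string line in lines)
        {
            if (line.Length > 0) count++;
        }
        // Do not count the header line as a data row.
        return header && count > 0 ? count - 1 : count;
    }

    // Lazily yields lines until ReadLine signals end of input with null.
    private static IEnumerable<string> ReadLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}
```

With this layout, adding a new input source only requires a new thin entry point that produces lines.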


using (var streamReader = new StreamReader(csvStream, encoding ?? Encoding.UTF8, detectEncodingFromByteOrderMarks: true, DefaultStreamReaderBufferSize, leaveOpen: true))
{
CsvLineEnumerator linesEnumerator = new CsvLineEnumerator(streamReader);
@pgovind (Contributor, Author) commented:

Just a couple of structs to wrap a StreamReader in an IEnumerable<string>.
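For readers unfamiliar with the idea, here is one way to expose a reader's lines as an `IEnumerable<string>`. It is a sketch under assumptions: it uses a C# iterator method rather than the hand-written structs the PR describes, and `LineSequence.ReadLines` is a hypothetical name, not part of the PR.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not the PR's CsvLineEnumerator: exposes the lines of a
// TextReader (StreamReader, StringReader, ...) as a lazy IEnumerable<string>.
public static class LineSequence
{
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        // Validate eagerly; iterator bodies only run on first MoveNext.
        if (reader == null) throw new ArgumentNullException(nameof(reader));
        return Iterate(reader);
    }

    private static IEnumerable<string> Iterate(TextReader reader)
    {
        string line;
        // ReadLine returns null once the end of the input is reached.
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}
```

A caller could then write `foreach (string line in LineSequence.ReadLines(streamReader)) { ... }`. The sequence is single-pass, because it consumes the underlying reader as it goes.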

long numberOfRowsToRead = -1, int guessRows = 10, bool addIndexColumn = false,
Encoding encoding = null)
{
string[] lines = csvString.Split(new[] { Environment.NewLine }, StringSplitOptions.None);
Member commented:

Are we just completely giving up on quoting? Or is this just a "first effort" to get the API in?

@pgovind (Contributor, Author) commented Oct 29, 2020

This is just a "first effort". Or, more accurately, the easiest way to get this API in with just some minor refactoring. I'll work on a proper CSV parser shortly and revisit this implementation at that point.
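To make the quoting concern concrete: splitting on a line separator cuts a quoted field in two whenever it contains a line break. In addition, `Environment.NewLine` is platform-dependent ("\r\n" on Windows, "\n" on Unix-like systems), so input using the other convention is not split as expected. The sketch below is a minimal quote-aware record splitter, not the parser the author mentions; `QuoteAwareSplit.SplitRecords` is a hypothetical name, and it assumes double-quote quoting with `""` as the escape.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch: splits CSV text into records while treating line
// breaks inside double-quoted fields as data rather than record terminators.
public static class QuoteAwareSplit
{
    public static List<string> SplitRecords(string csv)
    {
        var records = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < csv.Length; i++)
        {
            char c = csv[i];
            if (c == '"')
            {
                // An escaped quote ("") toggles twice, leaving the state unchanged.
                inQuotes = !inQuotes;
                current.Append(c);
            }
            else if (!inQuotes && (c == '\n' || c == '\r'))
            {
                // Treat \r\n as a single record terminator.
                if (c == '\r' && i + 1 < csv.Length && csv[i + 1] == '\n') i++;
                records.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // Keep a final record that has no trailing line break.
        if (current.Length > 0) records.Add(current.ToString());
        return records;
    }
}
```

For the input `"id,note\n1,\"line one\nline two\"\n2,plain"` this yields three records, whereas a plain split on `"\n"` yields four. Splitting each record into fields would still need its own quote-aware pass.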


public void Dispose()
{
throw new NotImplementedException();
Member commented:

Why throw on Dispose?

@pgovind (Contributor, Author) commented:

I put that in so we catch it if someone calls this method. We're not dealing with any unmanaged resources here, so no one should be calling Dispose.
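One general C# point is relevant to this exchange: a `foreach` loop calls `Dispose` on its enumerator when the loop exits if the enumerator implements `IDisposable`, whether or not unmanaged resources are involved. The excerpt does not show how the PR's enumerator is consumed, so the sketch below only demonstrates the language behavior. `CountingEnumerator` and `CountingSequence` are hypothetical types made up for this illustration.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator that records whether Dispose was called.
public sealed class CountingEnumerator : IEnumerator<int>
{
    private int _current;

    public bool Disposed { get; private set; }
    public int Current => _current;
    object IEnumerator.Current => _current;

    // Yields 1, 2, 3 and then stops.
    public bool MoveNext() => ++_current <= 3;
    public void Reset() => _current = 0;

    // A throwing Dispose here would surface at the end of every foreach.
    public void Dispose() => Disposed = true;
}

// Hypothetical sequence that remembers the last enumerator it handed out.
public sealed class CountingSequence : IEnumerable<int>
{
    public CountingEnumerator Last { get; private set; }

    public IEnumerator<int> GetEnumerator()
    {
        Last = new CountingEnumerator();
        return Last;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

After `foreach (int n in new CountingSequence()) { }` completes, the enumerator's `Disposed` flag is set, which shows the compiler-generated call. Code that drives `MoveNext` and `Current` by hand does not trigger it.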

@eerhardt (Member) left a comment:

LGTM

@pgovind merged commit 344845a into dotnet:master Oct 30, 2020
@pgovind deleted the csvString branch October 30, 2020 19:06
Development

Successfully merging this pull request may close these issues.

[DataFrame] Add a LoadCsv overload or other method that takes CSV content directly
2 participants