Better way to skip extraneous lines at the start? #890

Closed
rorourke-iot opened this issue Jan 3, 2018 · 1 comment

rorourke-iot commented Jan 3, 2018

I'm running this example in LINQPad. CsvHelper is pulled in as a NuGet package, with the appropriate namespaces added to the code.

I have a set of files I'm reading; example content is included in the code sample. Each file starts with several lines that should be skipped. I was using a custom CSV processor that let me specify how many lines to skip before processing the file. With CsvHelper, I'm trying to handle these lines, and my current take is below. As the commented-out code shows, I originally used a set of explicit reads, which felt brittle because I may want other comments or content at the top of the files. I'd also thought about asking whether the Read method could take an int parameter specifying a number of consecutive reads (this might still be a good idea, but it seems unnecessary for my specific case).

I could also comment out the "version" line, but this convention is prevalent in other (non-CSV) data files used in the application, and I don't want to change it without a good reason.

Is the approach below my best option for handling this content?

void Main()
{
  var data = @"; Gen 1.5 Package Repository List
; this file contains a list of packages with version and release

version=1
package,action,data
hotfix-monitorix-lighttpd-2.5.3-1.el6.noarch,remove,
monitorix-lighttpd-2.5.2-1.el6.noarch,remove,
openssl,removearch,i686
powerctl,removeolder,2.8.0-2.NS.el6
bash-4.1.2-15.el6_5.2.x86_64.rpm,install,
cmulogd-1.6.0-2.el6.pse.x86_64.rpm,install,";
  
  using (var reader = new StringReader(data))
  using (var csv = new CsvReader(reader))
  {
    csv.Configuration.RegisterClassMap<RpmRepoMap>();
    csv.Configuration.IgnoreBlankLines = true;
    csv.Configuration.AllowComments = true;
    csv.Configuration.Comment = ';';
    csv.Configuration.ShouldSkipRecord = content =>
    {
      if (content[0].StartsWith("version"))
        return true;
        
      if (content[0] == "package")
      {
        csv.ReadHeader();
        
        return true;
      }
      
      return false;
    };
    
//    csv.Read();
//    csv.Read();
//    csv.Read();
//    csv.Read();
//    csv.ReadHeader();
    csv.GetRecords<RpmRepo>().Dump();
  }
}

class RpmRepo
{
  public string Package { get; set; }
  public string Action { get; set; }
  public string Data { get; set; }
}

class RpmRepoMap : ClassMap<RpmRepo>
{
  public RpmRepoMap()
  {
    Map(m => m.Package).Name("package");
    Map(m => m.Action).Name("action");
    Map(m => m.Data).Name("data");
  }
}

JoshClose (Owner) commented

If it works for all files you'll be reading, it seems fine to me.

Here are a couple of other ways you could do it.

// Skip 4 rows.
for (var i = 0; i < 4; i++) 
{
    csv.Read();
}

// Skip until version= is found.
while (csv.Read())
{
    if (csv.Context.Record[0].StartsWith("version="))
    {
        csv.Read();
        csv.ReadHeader();
        break;
    }
}     
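
Another option, in the spirit of the custom processor mentioned in the question, is to strip the preamble before CsvHelper ever sees the text. The following is only a minimal sketch, assuming the files are small enough to buffer in memory. `PreambleSkipper` and `SkipUntilHeader` are hypothetical names, not part of CsvHelper, and the code uses only `System` and `System.IO`:

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of CsvHelper): drops every line that
// precedes the first line starting with a given header prefix.
static class PreambleSkipper
{
    // Returns a reader positioned at the header line. If no line starts
    // with headerPrefix, the returned reader is empty.
    // Assumption: the remaining content fits comfortably in memory.
    public static TextReader SkipUntilHeader(TextReader source, string headerPrefix)
    {
        string line;
        while ((line = source.ReadLine()) != null)
        {
            if (line.StartsWith(headerPrefix, StringComparison.Ordinal))
            {
                // Re-attach the header line to whatever follows it.
                var rest = source.ReadToEnd();
                return new StringReader(line + Environment.NewLine + rest);
            }
        }

        return new StringReader(string.Empty);
    }
}
```

With the sample data from the question, calling `PreambleSkipper.SkipUntilHeader(reader, "package,")` and passing the result to the `CsvReader` constructor would make the header row the first thing the parser reads, regardless of how many comment, blank, or "version" lines come before it.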
