Regular expressions for matrix information, i.e. parsing structured blocks of information from CSV or Excel files (or similar 2D matrices).

Zander

Named after the fish: Zander. It's a small library that eases parsing structured blocks of information within a 2-dimensional matrix. Typically you get this sort of data from report generators. You might still want to extract it programmatically, thus the need for the fish.

What problem does this library solve?

It helps when your data is in a structured format but is split into different blocks of information. A very simple example is the following:

                    Report Title      16/09/15 16:17  Page: 1
Company AB
Some text
that goes on and explains the report
  Id    Value  Type  Attribute 1  Attribute 2
  1244  25     A
  1244  25     B     255          155
  1244  25     C
  1250  25     B     255          100
  1250  25     C
                    Report Title      16/09/15 16:17  Page: 2
Company AB
Some text
that goes on and explains the report
  Id    Value  Type  Attribute 1  Attribute 2
  1251  25     A     255
  1251  25     B                  130
  1251  25     C
  1260  25     A
  1260  25     B     255          15
  1260  25     C

But the structure of the block layout might change from "page" to "page".

How do you match?

Match columns

  • Use _ to indicate that there should be an empty column
  • Use "Some constant" or constant to indicate a column with a constant value
  • Use @Value to indicate that you want to capture the value in that column
  • Use ( .. | .. ) to match any one of several alternatives
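
To make the four column matchers concrete, here is a minimal sketch of how one cell could be tested against one matcher. It is an illustration only, not Zander's implementation or API: the names CellSketch, MatchCell and TryMatchRow are hypothetical, the pattern is assumed to be already split into tokens, alternatives are not nested, and it assumes that @Value does not match an empty cell (which is why the example further down uses (@Attribute1|_) for optional values).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of the matcher semantics; not part of Zander.
public static class CellSketch
{
    // Tests a single cell against a single matcher token and records captures.
    public static bool MatchCell(string token, string cell, IDictionary<string, string> captures)
    {
        cell = (cell ?? "").Trim();

        if (token == "_")
            return cell.Length == 0;                  // _ : the column must be empty

        if (token.StartsWith("@", StringComparison.Ordinal))
        {
            if (cell.Length == 0) return false;       // assumption: a value must be non-empty
            captures[token.Substring(1)] = cell;      // @Name : capture the cell under Name
            return true;
        }

        if (token.StartsWith("(", StringComparison.Ordinal) &&
            token.EndsWith(")", StringComparison.Ordinal))
        {
            // ( a | b ) : the first alternative that matches wins (no nesting here)
            var alternatives = token.Substring(1, token.Length - 2).Split('|');
            foreach (var alternative in alternatives)
                if (MatchCell(alternative.Trim(), cell, captures)) return true;
            return false;
        }

        return token.Trim('"') == cell;               // constant, quoted or bare
    }

    // Matches a whole row, token by token; captures are only meaningful on success.
    public static bool TryMatchRow(string[] tokens, string[] cells, out Dictionary<string, string> captures)
    {
        captures = new Dictionary<string, string>();
        if (tokens.Length != cells.Length) return false;
        for (var i = 0; i < tokens.Length; i++)
            if (!MatchCell(tokens[i], cells[i], captures)) return false;
        return true;
    }
}
```

With the tokens _, @Id and (@Attribute1|_), a row whose cells are empty, 1244 and 255 would match and capture Id and Attribute1; the same row with an empty third cell would still match, but capture only Id.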

Match rows

To match a row, you give its row specification a name by appending ": title" to it. If you want the specification to match many rows with the same format, you add a '+' to the name, as in ": title+".
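
The naming rule can be sketched as follows. This is a hypothetical illustration, not Zander's parser: RowSketch, ParseSpec and TryLabelRows are invented names, and the sketch assumes that '+' means "one or more rows" matched greedily, in the way '+' works in ordinary regular expressions. How a single row is tested is left to a caller-supplied predicate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of ": name" and ": name+"; not part of Zander.
public static class RowSketch
{
    // Splits "<columns> : name" or "<columns> : name+" at the last ':'.
    public static (string Columns, string Name, bool Repeats) ParseSpec(string line)
    {
        var idx = line.LastIndexOf(':');
        if (idx < 0) throw new FormatException("A row specification needs a ': name' suffix.");

        var name = line.Substring(idx + 1).Trim();
        var repeats = name.EndsWith("+", StringComparison.Ordinal);
        if (repeats) name = name.Substring(0, name.Length - 1);

        return (line.Substring(0, idx).Trim(), name, repeats);
    }

    // Walks the rows in order and labels each one with the name of the
    // specification that matched it. Rows left over after the last
    // specification (for example the next "page") are not labelled.
    public static bool TryLabelRows(
        IReadOnlyList<(string Name, bool Repeats, Func<string[], bool> IsMatch)> specs,
        IReadOnlyList<string[]> rows,
        out List<string> labels)
    {
        labels = new List<string>();
        var r = 0;
        foreach (var spec in specs)
        {
            // Every specification must match at least one row.
            if (r >= rows.Count || !spec.IsMatch(rows[r])) return false;
            labels.Add(spec.Name);
            r++;

            // A '+' specification keeps consuming rows of the same format.
            while (spec.Repeats && r < rows.Count && spec.IsMatch(rows[r]))
            {
                labels.Add(spec.Name);
                r++;
            }
        }
        return true;
    }
}
```

For example, ParseSpec("@Text _ _ : text+") yields the columns "@Text _ _", the name "text" and Repeats set to true.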

How does it look?

How do you use this library to extract the information above? You use the parser builder:

using Zander;
...
var parsed = new BlockEx(@" _ _ _ _ _ _ ""Report Title"" _ _ _ @Time @Page : report_title
                            ""Company AB"" _ _ _ _ _ _ _ _ _ _ _ : company
                            @Text _ _ _ _ _ _ _ _ _ _ _ : text+
                            _ Id _ Value Type _ _ ""Attribute 1"" _ ""Attribute 2"" _ _ : header
                            _ @Id _ @Value @Type _ _ (@Attribute1|_) _ (@Attribute2|_) _ _ : row+
                          ")
    .Matches(arrayOfArrays);

This will give you structured information that will be easy to consume.
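
The snippet above does not show where arrayOfArrays comes from. A minimal sketch, assuming Matches accepts a jagged string array (string[][]) with one inner array per row, is to split delimited text yourself. MatrixSketch and FromDelimited are hypothetical helpers, not part of Zander, and the splitting is naive: it does not handle quoted fields that contain the separator.

```csharp
using System;
using System.Linq;

// Hypothetical helper for building the input matrix; not part of Zander.
public static class MatrixSketch
{
    // Turns delimited text into a jagged array: one string[] per line,
    // one string per cell. Empty cells are kept so column positions survive.
    public static string[][] FromDelimited(string text, char separator = ';')
    {
        return text
            .Split(new[] { "\r\n", "\n" }, StringSplitOptions.None)
            .Where(line => line.Length > 0)          // drop blank lines, e.g. after a trailing newline
            .Select(line => line.Split(separator))   // keeps empty cells between separators
            .ToArray();
    }
}
```

You could then load a file with something like MatrixSketch.FromDelimited(System.IO.File.ReadAllText("report.csv")), where report.csv is a placeholder file name, and pass the result to Matches. For Excel files or CSV with quoted fields, a proper reader is the safer choice.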