isaacy2012/CSVC


CSVC

Categorise CSV cells based on string matching and Regex.

CSVC parses a config file and uses the rules in it to categorise CSV cells automatically.


Parameters

noheader means the input CSV file doesn't contain a header row, so matching can begin from the first row (row 0, since rows are 0-indexed).

column NUMLIST sets the columns of interest to the given list of 0-indexed column numbers, e.g. column 0,1,2 means CSVC searches columns 0, 1, and 2 when attempting to match.
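The sketch below shows how noheader and column could be applied when reading a file. It is not taken from CSVC's source: the function names split_header and cells_of_interest are hypothetical, and skipping out-of-range column numbers is an assumption.

```python
import csv
import io


def split_header(csv_text, noheader):
    """Split CSV text into (header, data_rows).

    With noheader set, every row is treated as data and header is None.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if noheader or not rows:
        return None, rows
    return rows[0], rows[1:]


def cells_of_interest(row, columns):
    """Return the cells at the 0-indexed column numbers given by `column`.

    Assumption: column numbers past the end of the row are skipped.
    """
    return [row[i] for i in columns if i < len(row)]
```

For a config containing column 0,2, cells_of_interest would be called with columns=[0, 2] for each data row.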

MODE controls how CSVC writes the substitutions. append appends the substitution to the end of the line, replace NUM replaces column NUM with the substitution, and insert NUM QUOTE inserts a new column named QUOTE at index NUM.
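The three modes can be pictured as list operations on a single row. This is a minimal sketch under stated assumptions, not CSVC's actual implementation: apply_mode is a hypothetical name, and header handling for insert (writing QUOTE into the header row) is left out.

```python
def apply_mode(row, category, mode, index=None):
    """Return a new row with `category` written according to `mode`.

    mode is "append", "replace" or "insert"; index is the NUM argument
    and is only used by "replace" and "insert".
    """
    if mode == "append":
        # The category becomes a new last cell.
        return row + [category]
    if mode == "replace":
        # The cell at `index` is overwritten; the row length is unchanged.
        return row[:index] + [category] + row[index + 1:]
    if mode == "insert":
        # A new cell is placed at `index`; later cells shift right.
        return row[:index] + [category] + row[index:]
    raise ValueError(f"unknown mode: {mode}")
```

Note that replace keeps the number of columns the same, while append and insert each add one.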

Grammar

The grammar for the config file is as follows:

CONFIG ::= HEADER? COLUMN? MODE CATEGORY+
HEADER ::= noheader OPTSEMI
COLUMN ::= column NUMLIST OPTSEMI
NUMLIST ::= NUM [, NUM]*
MODE ::= append OPTSEMI | replace NUM OPTSEMI | insert NUM QUOTE OPTSEMI
CATEGORY ::= category QUOTE { RULESET* }
RULESET ::= STRINGRULETYPE | REGEXRULETYPE
STRINGRULETYPE ::= RULETYPE { RULE* }
RULETYPE ::= equals | contains
REGEXRULETYPE ::= matches { REGEXRULE* }
RULE ::= QUOTE OPTSEMI
REGEXRULE ::= REGEX OPTSEMI
QUOTE ::= "[a-zA-Z ]+"
REGEX ::= ".*"
NUM ::= [0-9]+
OPTSEMI ::= ;?
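
A first step in handling this grammar is splitting the config text into tokens. The following is a rough sketch only, not CSVC's parser: tokenize and the token kind names are hypothetical, and it assumes quoted strings and regexes never contain a double quote.

```python
import re

# One alternative per terminal in the grammar; leading whitespace is skipped.
TOKEN_RE = re.compile(r'''
    \s*(?:
        (?P<QUOTE>"[^"]*")   # quoted string or regex (assumed to hold no '"')
      | (?P<NUM>[0-9]+)      # column index
      | (?P<WORD>[a-z]+)     # keyword such as column, category, contains
      | (?P<PUNCT>[{},;])    # braces, list separator, optional semicolon
    )
''', re.VERBOSE)


def tokenize(text):
    """Turn config text into a list of (kind, value) pairs."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        kind = m.lastgroup
        value = m.group(kind)
        if kind == "QUOTE":
            value = value[1:-1]  # drop the surrounding double quotes
        tokens.append((kind, value))
        pos = m.end()
    return tokens
```

A recursive-descent parser could then consume this list one production at a time, treating a ; token as optional wherever OPTSEMI appears.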

For example:

noheader
column 0,1
replace 1
category "Food And Drink" {
    contains {
        "Rest"
        "Mc Donalds"
        "Maccas"
    }
}
category "Bad" {
    contains {
        "Key"
    }
}
category "Groceries" {
    equals {
        "New World"
    }
    contains {
        "New World"
        "CountDown"
    }
    matches {
        "b.*a"
    }
}
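
To illustrate how the rules in this example might be evaluated, here is a small sketch. It rests on several assumptions that the README does not confirm: categories are tried in file order and the first match wins, matching is case-sensitive, and matches uses an unanchored regex search. The CATEGORIES structure and the function names are hypothetical.

```python
import re

# The example config above, written out as plain Python data.
CATEGORIES = [
    ("Food And Drink", {"contains": ["Rest", "Mc Donalds", "Maccas"]}),
    ("Bad", {"contains": ["Key"]}),
    ("Groceries", {
        "equals": ["New World"],
        "contains": ["New World", "CountDown"],
        "matches": ["b.*a"],
    }),
]


def cell_matches(cell, rules):
    """Return True if the cell satisfies any rule in the category."""
    if cell in rules.get("equals", []):
        return True
    if any(text in cell for text in rules.get("contains", [])):
        return True
    # Assumption: the regex may match anywhere in the cell.
    return any(re.search(pattern, cell) for pattern in rules.get("matches", []))


def categorise(row, columns, categories):
    """Return the first category matching any column of interest, else None."""
    for name, rules in categories:
        if any(cell_matches(row[i], rules) for i in columns):
            return name
    return None
```

With column 0,1, a row such as 2021-01-01,Maccas Queen St would be categorised as Food And Drink under these assumptions, and replace 1 would then overwrite the second cell with that name.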
