This Ruby script processes all the xlsx files in a folder (expected to be response files from tuition providers, all following the same template) and aggregates the parsed information into several csv files.
Validation errors are displayed if there are problems finding, accessing or parsing the spreadsheets in the given path.
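
As an illustration of the kind of checks involved, here is a minimal sketch of locating the spreadsheets and failing with a readable message. The helper name `find_spreadsheets` and the error messages are hypothetical and are not taken from the script itself.

```ruby
# Hypothetical helper (not the script's actual code): returns the sorted
# list of .xlsx files in +path+, or raises a descriptive error when the
# folder is missing, unreadable or contains no spreadsheets.
def find_spreadsheets(path)
  raise ArgumentError, "Folder not found: #{path}" unless Dir.exist?(path)
  raise ArgumentError, "Folder not readable: #{path}" unless File.readable?(path)

  files = Dir.glob(File.join(path, "*.xlsx")).sort
  raise ArgumentError, "No xlsx files found in: #{path}" if files.empty?

  files
end
```

A caller could rescue `ArgumentError` and print the message instead of a stack trace.
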
The implementation is based on several components:

- A simple Ruby script located in the /bin folder that delegates the parsing job to the next two Ruby classes and displays stats with the results.
- The `FolderParser` class, which finds the spreadsheet files and stores the extracted data models in in-memory data structures during the parsing process.
- The `FileParser` class, the main class, which opens a file and extracts the provider details, pricing and locations served by the tuition provider.
- The `roo` gem (open source library) to read and work with spreadsheets in Ruby.
- The `csv` standard Ruby library to generate csv files.
- Several Service and Report classes that help generate csv files from the extracted Ruby objects.
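
To make the last step more concrete, here is a minimal sketch of turning extracted Ruby objects into csv text with the `csv` standard library. The `Provider` struct, its fields and the `providers_to_csv` method are assumptions made for illustration; the real Service and Report classes and their column names may differ.

```ruby
require "csv"

# Hypothetical data model; the script's real classes and columns may differ.
Provider = Struct.new(:name, :website, :email)

# Builds csv text: a header row followed by one row per provider.
# CSV.generate takes care of quoting values that contain commas.
def providers_to_csv(providers)
  CSV.generate do |csv|
    csv << %w[name website email]
    providers.each do |provider|
      csv << [provider.name, provider.website, provider.email]
    end
  end
end

providers = [
  Provider.new("Provider1", "https://example.com", "info@example.com"),
  Provider.new("Provider, Two", "https://example.org", "hello@example.org")
]
puts providers_to_csv(providers)
```

Writing the result to disk would then be a matter of passing the returned string to `File.write`.
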

Clone the repository and change into its directory:

    git clone git@github.com:DFE-Digital/tuition_parser.git
    cd tuition_parser

If the Ruby runtime is not installed on your machine, please install it first. A Ruby version manager such as rbenv or rvm might be helpful.
You will need the Bundler gem to install the Ruby dependencies. In the root directory of the cloned script, check that it is installed and accessible by typing:

    bundle -v

Install it if you need to:

    gem install bundler

Once installed, the next command will install all the Ruby dependencies the script needs to run properly:

    bundle install

Run the script, passing the path to the folder containing the spreadsheets:

    ./bin/parse.rb path_to_spreadsheets_folder

Depending on the contents of your folder, it will display something similar to:

    ~/scripts/tuition_parser $ ./bin/parse.rb ../../spreadsheets
    Installing dependencies...
    Processing /Users/ltello/spreadsheets/Provider1.xlsx
    Processing /Users/ltello/spreadsheets/Provider2.xlsx
    Processing /Users/ltello/spreadsheets/Provider3.xlsx
    Processing /Users/ltello/spreadsheets/Provider4.xlsx
    Processing /Users/ltello/spreadsheets/Provider5.xlsx
    Providers: 36 (Check generated file: /Users/ltello/spreadsheets/providers.csv)
    Prices: 2588 (Check generated file: /Users/ltello/spreadsheets/prices.csv)
    Regions: 9 (Check generated file: /Users/ltello/spreadsheets/regions.csv)
    LADs: 309 (Check generated file: /Users/ltello/spreadsheets/lads.csv)
    Locations: 599 (Check generated file: /Users/ltello/spreadsheets/locations.csv)
    ~/scripts/tuition_parser $
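
If you want to cross-check the reported totals against the generated files, a small sketch like the following could count the data rows in one of them. The `count_rows` method is hypothetical, and it assumes each generated csv starts with a header row, which this README does not confirm.

```ruby
require "csv"

# Hypothetical check: counts the data rows in a generated csv file.
# Assumes the first line is a header row (an assumption about the output).
def count_rows(csv_path)
  CSV.foreach(csv_path, headers: true).count
end
```

Under that assumption, calling `count_rows` with the path to providers.csv should return the same figure as the Providers line in the output.
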