Adding an Operator File Parser - rough RegEx
alterisian committed Oct 30, 2010
1 parent d5aba47 commit 45029bf
Showing 2 changed files with 121 additions and 0 deletions.
105 changes: 105 additions & 0 deletions data/gmpte_foi/FARETABLES/Operators/Operators.txt
@@ -0,0 +1,105 @@
OPERATOR TRADE NAME
ABC Atherton Bus Company
ACT Accrington Transport
ARL Arriva Liverpool
ASH Ashall's Coaches
ATC Aintree Coachlines
ATV Amber Travel
BAK Bakerbus
BBC Bluebird Bus & Coach
BBL Blue Bus
BBS Blue Bus
BBT Blackburn Transport
BDT Blackburn with Darwen T'sport
BEE Arriva Manchester
BEL South Lancs Travel
BEV Belle Vue Coaches
BPT Burnley & Pendle
BRA Bradshaws Coaches
BUL R Bullock Buses
BUV Bu-Val Buses
BWS Bowers Coaches
CAL First Yorkshire West
CHC Chester's Coaches
CLK Click Services
CNC Clifton Coaches
COA The Coachmasters
COU Courtesy Coaches
CRO Arriva Cymru
CSL Coachways
CSS Coach Services
CTY County Coaches
DEN Dennis's Coaches
DGC D & G Coach & Bus
DVY Douglas Valley Bus & Coach
ELS Ellen Smith Tours
FCH Finches
FIN Finglands
FPR First Pioneer
FRE Freebird Coaches
GCO G B Coaches
GMB GM Buses
GMN First
GMS Stagecoach Manchester
GOO Go Goodwins
GTB South Lancs Travel
HAY Hayton's Coaches
HBC Huddersfield Bus Company
HHC Hulme Hall Coaches
HTL HTL Buses
JPX J P Travel
JSC Jim Stones Coaches
KFR First
KMT K'Matt Coaches
LAC L.A. Coaches
LCL Ashall's Coaches
LTL Leisure Travel
LUL Lancashire United
MAY Mayne of Manchester
MET Metrolink
MHT M & H Travel
MRT M R Travel
MTL Arriva Merseyside
MTR M Travel Mini Buses
MTT Maytree Travel
MYB U. K. North
NBE Northern Blue
NOR Arriva North West
OLY Olympia Travel
PIO Pioneer Travel
PMT First
RAM Houston Ram
RDT Rossendale Transport
RIT Rothwells Super Travel
RKT Red Kite Travel
RMS Stagecoach In Lancashire
RSL Strawberry Bus
RWC Ringwood Luxury Coaches
RWY Ringway Minibus
SFC Springfield Coachways
SPL Speedwellbus
SPO Pioneer Travel
STE Arriva Midlands North
STO Stott's of Oldham
SUR Sureway Travel
SWA Swans Travel
TAN Checkmate Mini Coaches
TLB Town Lynx Buses
TLT Timeline Travel
TML Travel Master
TMT Trent Buses
TSB Tattersall's Bus & Coach
TTL TM Travel
TTS Thor Tpt (assoc with Bluebird
TWM Williams Coaches
TYT Tyrer Bus
VAL Vale of Manchester
VIK Viking Coaches
VLL Vale of Llangollen Travel
WBC Arriva Manchester (Wigan Area)
WBL Wigan Buses
WBT Warrington Borough Transport
WCW Bennett's
WES Midwest Buses
WML Manchester Community Transport
WTZ W. Tresize
16 changes: 16 additions & 0 deletions lib/read_operators.rb
@@ -0,0 +1,16 @@
# Read the operators file and print each operator code and trade name.
# TODO: populate the Operator model instead of printing.
filename = '../data/gmpte_foi/FARETABLES/Operators/Operators.txt'
line_num = 0
File.foreach(filename) do |line|
  # Rough pattern: three capitals, one space, then a run of word
  # characters and spaces that must end in a space.
  match = line.scan(/([A-Z][A-Z][A-Z]) ([\w ]* )/)
  # Line 0 is the "OPERATOR TRADE NAME" header row, so skip it.
  if line_num > 0 && !match.first.nil?
    puts "code:" + match.first.first
    puts "name:" + match.first.last
  end
  line_num += 1
end

puts "read #{line_num} lines"


5 comments on commit 45029bf

@ciaranj

Yay for regexes (my fave!) ... does yours match correctly though? I'd expect it to screw up on 'Manchester Community Transport' ... it'd be extracting just 'Manchester Community', would it not?

Fwiw .. being a total regex geek, I'd go with:

 \s+([^\s]{3})\s+(.+)

as your data is pretty specific and you're not really trying to validate it, just parse it :)
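
A minimal sketch of how the suggested expression could be used from Ruby. The helper name `parse_operator_line` and the constant `OPERATOR_LINE` are hypothetical and not part of the commit. Two changes from the expression as posted are assumptions: the leading `\s+` is relaxed to `\s*` because the listing above shows no whitespace before the code, and the pattern is anchored so the trailing newline is trimmed from the name.

```ruby
# Hypothetical helper, not part of the commit: parse one line of Operators.txt.
# Assumes a three-character code, some whitespace, then the trade name.
OPERATOR_LINE = /\A\s*([^\s]{3})\s+(.+?)\s*\z/

def parse_operator_line(line)
  match = OPERATOR_LINE.match(line)
  return nil if match.nil? # header row or malformed line

  { code: match[1], name: match[2] }
end

p parse_operator_line("WML Manchester Community Transport\n")
p parse_operator_line("TTS Thor Tpt (assoc with Bluebird\n")
p parse_operator_line("OPERATOR TRADE NAME\n")
```

Because the code must be exactly three non-whitespace characters followed by whitespace, the header row fails to match without needing a line counter. A file with longer codes would need a different pattern.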

@alterisian
Owner Author

Yeah, looking at it, the ( )'s will also mess it up. I'm refreshing my regex, so I appreciate the help.
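
To see where the pattern in lib/read_operators.rb goes wrong, here is a small sketch that runs it over a few names from Operators.txt. The sample strings assume a single space between code and name, as in the listing above; the real file's whitespace may differ.

```ruby
# The pattern from lib/read_operators.rb, unchanged.
ORIGINAL = /([A-Z][A-Z][A-Z]) ([\w ]* )/

# Sample lines copied from Operators.txt (single-space separation assumed).
samples = [
  "WML Manchester Community Transport",
  "WBC Arriva Manchester (Wigan Area)",
  "ASH Ashall's Coaches",
  "BBC Bluebird Bus & Coach"
]

# Pair each line with the first set of captures, or nil when nothing matches.
results = samples.map { |line| [line, line.scan(ORIGINAL).first] }
results.each { |line, captures| puts "#{line.inspect} => #{captures.inspect}" }
```

Under that assumption, the name is cut at the last space before the first character outside `[\w ]` or before the end of the line. A name such as Ashall's Coaches produces no match at all, because the apostrophe arrives before any space does.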

@ciaranj

Not sure how the parentheses mess it up, but it turns out this site: http://regexpal.com/ is evil ;) .. this site isn't: http://www.regexplanet.com/simple/index.html ... the first one had me questioning my regex abilities for 15 minutes, arghh! :) ... I've updated my previous expression to demonstrate negated character classes, cos they're handy for you here.
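
A short sketch of the negated character class mentioned here: `[^\s]` means "any character that is not whitespace", which Ruby also spells `\S`. The sample line is taken from Operators.txt, and single-space separation is assumed.

```ruby
line = "DGC D & G Coach & Bus"

# Negated character class: three characters that are not whitespace.
negated = line.match(/([^\s]{3})\s+(.+)/)
# The \S shorthand expresses the same thing.
shorthand = line.match(/(\S{3})\s+(.+)/)

puts negated.captures.inspect
puts shorthand.captures.inspect
```

Both forms capture the same code and name, including the ampersands that the `[\w ]` class in the original pattern rejects.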

@alterisian
Owner Author

Cheers dude. Will look at this when I next get some time again - probably next weekend. http://rubular.com is pretty good too :)

@ciaranj

No worries dude ;) .. Just finishing up some work on http://github.com/ciaranj/node-oauth myself ;0 Time for sleepy byes..