Because I wrote this when I was in bed with the flu 😰.
A URL parser - but mostly an exercise in string manipulation with Ruby 2.5.1 and testing with RSpec 3.8+
- Will parse most commonly found URLs and store each part in a hash.
- Path segments are separated into an array for convenience.
- Querystrings are separated into a hash for convenience.
- Does not support IPv6 host names (maybe one day) - raises StandardError
- Raises StandardError for non-numeric port numbers.
- See lib/death-bed.rb for code. Heaps of comments - pretty ugly. I chose not to abstract the code into too many methods. The logic to parse the URL is so tightly coupled that it made sense (against all other urges) to keep most of it together.
- See spec/URLParserSpec for tests which contain a large range of URL format combinations.
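To give a feel for two of the behaviours listed above, here is a minimal sketch. The method names `parse_port` and `path_segments` are hypothetical and are not taken from lib/death-bed.rb, which may well do this differently.

```ruby
# Hypothetical helpers for illustration only (not the code in lib/death-bed.rb).

# Accept a port only if it is made up entirely of digits.
def parse_port(port)
  raise StandardError, "Invalid port: #{port}" unless port.match?(/\A\d+\z/)
  port.to_i
end

# Split a path into its segments, dropping the empty strings that
# come from a leading or doubled slash.
def path_segments(path)
  path.split("/").reject(&:empty?)
end

parse_port("8080")        # => 8080
path_segments("/a/b/c")   # => ["a", "b", "c"]
```

Calling `parse_port("80a")` raises StandardError, matching the behaviour described in the list.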
$ gem install rspec
String 👉 include?(string)
Need to know if a string contains a character or sequence of characters? In this example I was looking for the double slashes (//) which denote the beginning of the Authority component...
irb(main):002:0> str = "foo://example.com"
=> "foo://example.com"
irb(main):003:0> str.include? "//"
=> true
...
In this example I remove the question mark (?) character which denotes the beginning of the querystring...
irb(main):004:0> str = "?fname=rach&currentstatus=sickasadog"
=> "?fname=rach&currentstatus=sickasadog"
irb(main):005:0> str[1..-1]
=> "fname=rach&currentstatus=sickasadog"
irb(main):006:0>
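One thing to watch: a range like `[1..-1]` drops the first character whatever it is. As a hedged alternative (not what this project uses, as far as the text above says), `String#delete_prefix`, available from Ruby 2.5, only removes the question mark when it is actually there. The variable names below are made up for the example.

```ruby
with_mark    = "?fname=rach"
without_mark = "fname=rach"

# The range form always drops the first character.
sliced_ok  = with_mark[1..-1]      # => "fname=rach"
sliced_bad = without_mark[1..-1]   # => "name=rach" (lost the "f")

# delete_prefix only strips the prefix if the string starts with it.
safe_a = with_mark.delete_prefix("?")     # => "fname=rach"
safe_b = without_mark.delete_prefix("?")  # => "fname=rach"
```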
String 👉 split(string)
In this example I separate the querystring into its key/value pairs - the ampersand character (&) separates each pair. split returns all of them in an array...
irb(main):013:0> str = "fname=rach&currentstatus=sickasadog"
=> "fname=rach&currentstatus=sickasadog"
irb(main):014:0> pairs = str.split("&")
=> ["fname=rach", "currentstatus=sickasadog"]
I can then split again, this time on each pair, on the equals character (=), in order to read the keys and values individually.
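That second split might look something like the sketch below. The method name `query_to_hash` is hypothetical, and the limit of 2 passed to split is my own assumption so that a value containing an equals character stays in one piece.

```ruby
# Hypothetical helper: turn "a=1&b=2" into a hash of keys and values.
def query_to_hash(query)
  params = {}
  query.split("&").each do |pair|
    # A limit of 2 splits on the first "=" only.
    key, value = pair.split("=", 2)
    params[key] = value
  end
  params
end

params = query_to_hash("fname=rach&currentstatus=sickasadog")
params["fname"]          # => "rach"
params["currentstatus"]  # => "sickasadog"
```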
String 👉 partition(string)
(AKA my new best friend 😍)
In this example I need to get the user info on the left of the @ and the host name on the right. partition will locate the first occurrence of a given character and then return an array containing the preceding characters, the given character, and the characters which follow. Nice!
irb(main):015:0> str = "user@example.com"
=> "user@example.com"
irb(main):016:0> str.partition("@")
=> ["user", "@", "example.com"]
There's also rpartition, which starts the search from the end of the string. Groovy!
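The difference only shows up when the separator appears more than once. A small sketch, using a made-up authority string whose password contains an @:

```ruby
authority = "user:p@ss@example.com"

# partition splits on the FIRST "@".
first = authority.partition("@")
# => ["user:p", "@", "ss@example.com"]

# rpartition splits on the LAST "@".
last = authority.rpartition("@")
# => ["user:p@ss", "@", "example.com"]

# With no match, partition puts the whole string first,
# while rpartition puts it last.
no_match_left  = "example.com".partition("@")   # => ["example.com", "", ""]
no_match_right = "example.com".rpartition("@")  # => ["", "", "example.com"]
```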
String 👉 slice!(regular expression)
Need to remove a character sequence from your string? In this example I remove the double slashes (//) which denote the beginning of the Authority component...
irb(main):023:0> str = "foo://example.com"
=> "foo://example.com"
irb(main):024:0> str.slice!(/\/\//)
=> "//"
irb(main):025:0> str
=> "foo:example.com"
There's also a non-destructive version (omit the !), which returns the matched text and leaves the original string unchanged.
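A quick sketch of the two side by side. Note that plain slice hands back the match rather than the remainder, so if what you want is a new string with the slashes gone, `sub` is one option (my suggestion, not something this project necessarily uses):

```ruby
str = "foo://example.com"

# slice returns the matched text; str is left alone.
matched = str.slice(/\/\//)   # => "//"
untouched = str.dup            # => "foo://example.com"

# sub returns a new string with the first match replaced.
without_slashes = str.sub("//", "")  # => "foo:example.com"

# slice! returns the match AND removes it from the receiver.
copy = str.dup
removed = copy.slice!(/\/\//)  # => "//", copy is now "foo:example.com"
```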
String 👉 count(string)
Need to count the occurrences of a character in your string? (count treats its argument as a set of characters, not as a sequence.) In this example I count the number of path segment separators (slashes)...
irb(main):028:0> str = "this/is/5/path/segments/"
=> "this/is/5/path/segments/"
irb(main):029:0> str.count("/")
=> 5
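Because count works on a set of characters, passing it more than one character counts each of them individually. If you need to count a sequence instead, `scan` is one way to do it. A small sketch with a made-up string:

```ruby
str = "a//b/c"

# Counts every "/" in the string.
slashes = str.count("/")          # => 3

# Counts every "/" AND every "a" (a set, not the sequence "/a").
slash_or_a = str.count("/a")      # => 4

# scan finds non-overlapping occurrences of the sequence "//".
double_slashes = str.scan("//").length  # => 1
```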
- Add support for IPv6 host names.
- Can you think of others?