Implement initial parser #1
Merged
Changes from 11 commits
16 commits
37929c8 Implement initial parser (doomspork)
5ccda3e Convert YAML document into Elixir types (doomspork)
9819a63 Create structs for Version and OperatingSystem (doomspork)
30e604a Parser base and behaviour (doomspork)
df0a68f Device parser (doomspork)
a16f1fd Operating system parsing (doomspork)
e3acdd7 Version parsing support (doomspork)
f9c7e01 Env based configuration (doomspork)
d76cfd8 User Agent parser (doomspork)
35d4bc0 Simplify configuration (doomspork)
203833d Finalize user agent parsing (doomspork)
a5f6ab2 Clean-up os and user-agent parsing with macro (doomspork)
80fc66d Setup package details (doomspork)
c102502 Update README (doomspork)
cbdf40b rename project to UAParser
ba0d0b0 add credo and fix its warnings for base-run (not strict mode) (ybur-yug)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,4 @@ | ||
# This file is responsible for configuring your application | ||
# and its dependencies with the aid of the Mix.Config module. | ||
use Mix.Config | ||
|
||
# This configuration is loaded before any dependency and is restricted | ||
# to this project. If another project depends on this project, this | ||
# file won't be loaded nor affect the parent project. For this reason, | ||
# if you want to provide default values for your application for | ||
# 3rd-party users, it should be done in your "mix.exs" file. | ||
|
||
# You can configure for your application as: | ||
# | ||
# config :user_agent_parser, key: :value | ||
# | ||
# And access this configuration in your application as: | ||
# | ||
# Application.get_env(:user_agent_parser, :key) | ||
# | ||
# Or configure a 3rd-party app: | ||
# | ||
# config :logger, level: :info | ||
# | ||
|
||
# It is also possible to import configuration files, relative to this | ||
# directory. For example, you can emulate configuration per environment | ||
# by uncommenting the line below and defining dev.exs, test.exs and such. | ||
# Configuration from the imported file will override the ones defined | ||
# here (which is why it is important to import them last). | ||
# | ||
# import_config "#{Mix.env}.exs" | ||
config :user_agent_parser, | ||
patterns: "./patterns.yml" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,38 @@ | ||
defmodule UserAgentParser do | ||
@moduledoc """ | ||
""" | ||
|
||
use Application | ||
|
||
# See http://elixir-lang.org/docs/stable/elixir/Application.html | ||
# for more information on OTP Applications | ||
alias UserAgentParser.{Parser, Storage} | ||
|
||
@doc false | ||
def start(_type, _args) do | ||
import Supervisor.Spec, warn: false | ||
|
||
children = [ | ||
# Define workers and child supervisors to be supervised | ||
# worker(UserAgentParser.Worker, [arg1, arg2, arg3]), | ||
worker(Storage, []), | ||
] | ||
|
||
# See http://elixir-lang.org/docs/stable/elixir/Supervisor.html | ||
# for other strategies and supported options | ||
opts = [strategy: :one_for_one, name: UserAgentParser.Supervisor] | ||
Supervisor.start_link(children, opts) | ||
end | ||
|
||
@doc """ | ||
Parse a user-agent string into structs | ||
|
||
# Examples | ||
|
||
iex> agent_string = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_7; en-us) AppleWebKit/530.17 (KHTML, like Gecko) Version/4.0 Safari/530.17 Skyfire/2.0" | ||
iex> user_agent = UserAgentParser.parse(agent_string) | ||
iex> to_string(user_agent) | ||
"Skyfire 2.0" | ||
iex> to_string(user_agent.os) | ||
"Mac OS X 10.5.7" | ||
iex> to_string(user_agent.device) | ||
"Other" | ||
""" | ||
def parse(user_agent), do: Parser.parse(pattern, user_agent) | ||
|
||
defp pattern, do: Storage.list | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
defmodule UserAgentParser.Device do | ||
@moduledoc """ | ||
Device struct and helper methods | ||
""" | ||
|
||
@doc """ | ||
# Examples | ||
|
||
iex> device = %UserAgentParser.Device{family: "iPhone"} | ||
iex> to_string(device) | ||
"iPhone" | ||
|
||
iex> device = %UserAgentParser.Device{} | ||
iex> to_string(device) | ||
"Other" | ||
""" | ||
defstruct [:family] | ||
end | ||
|
||
defimpl String.Chars, for: UserAgentParser.Device do | ||
def to_string(%{family: nil}), do: "Other" | ||
def to_string(%{family: family}), do: family | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
defmodule UserAgentParser.OperatingSystem do | ||
@moduledoc """ | ||
Operating System struct | ||
""" | ||
|
||
@doc """ | ||
# Examples | ||
|
||
iex> version = %UserAgentParser.Version{major: "1", minor: "2"} | ||
iex> os = %UserAgentParser.OperatingSystem{family: "macOS", version: version} | ||
iex> to_string(os) | ||
"macOS 1.2" | ||
|
||
iex> os = %UserAgentParser.OperatingSystem{family: "macOS"} | ||
iex> to_string(os) | ||
"macOS" | ||
""" | ||
defstruct [:family, :version] | ||
end | ||
|
||
defimpl String.Chars, for: UserAgentParser.OperatingSystem do | ||
def to_string(%{family: family, version: version}), do: String.trim("#{family} #{version}") | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
defmodule UserAgentParser.Parser do | ||
@moduledoc """ | ||
Handle parsing the user-agent string | ||
""" | ||
|
||
alias UserAgentParser.UserAgent, as: Agent | ||
alias UserAgentParser.Parsers.{Device, OperatingSystem, UserAgent} | ||
|
||
@doc """ | ||
Parse a user-agent string given a set of patterns | ||
""" | ||
def parse({ua_patterns, os_patterns, device_patterns}, user_agent) do | ||
user_agent | ||
|> sanitize | ||
|> parse_os(os_patterns) | ||
|> parse_device(device_patterns) | ||
|> parse_user_agent(ua_patterns) | ||
end | ||
|
||
defp find_and_parse(patterns, user_agent, module) do | ||
patterns | ||
|> search(user_agent) | ||
|> module.parse | ||
end | ||
|
||
defp match(nil, _string), do: nil | ||
defp match(group, string) do | ||
match = | ||
group | ||
|> Keyword.fetch!(:regex) | ||
|> Regex.run(string) | ||
|
||
{group, match} | ||
end | ||
|
||
defp parse_device({user_agent, acc}, patterns) do | ||
device = find_and_parse(patterns, user_agent, Device) | ||
{user_agent, Map.put(acc, :device, device)} | ||
end | ||
|
||
defp parse_os(user_agent, patterns) do | ||
os = find_and_parse(patterns, user_agent, OperatingSystem) | ||
{user_agent, %{os: os}} | ||
end | ||
|
||
defp parse_user_agent({user_agent, acc}, patterns) do | ||
patterns | ||
|> find_and_parse(user_agent, UserAgent) | ||
|> Map.merge(acc) | ||
end | ||
|
||
defp sanitize(user_agent), do: String.trim(user_agent) | ||
|
||
defp search(groups, string) do | ||
groups | ||
|> Enum.find(fn(group) -> | ||
group | ||
|> Keyword.fetch!(:regex) | ||
|> Regex.match?(string) | ||
end) | ||
|> match(string) | ||
end | ||
end |
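Review note on `search/2` above: `Enum.find/2` with `Regex.match?/2`, followed by `Regex.run/2` in `match/2`, evaluates the winning regex twice. Below is a minimal sketch of a single-pass alternative, assuming the same keyword-list groups with a compiled `:regex` key. `SearchSketch` is a hypothetical module name and is not part of this PR.

```elixir
defmodule SearchSketch do
  # Returns {group, captures} for the first group whose :regex matches,
  # or nil when no group matches. Each regex is evaluated at most once.
  def search(groups, string) do
    Enum.find_value(groups, fn group ->
      case Regex.run(Keyword.fetch!(group, :regex), string) do
        nil -> nil
        captures -> {group, captures}
      end
    end)
  end
end

# Illustrative patterns only; the real ones come from patterns.yml.
groups = [[regex: ~r/Firefox\/(\d+)/], [regex: ~r/Skyfire\/(\d+)/]]
SearchSketch.search(groups, "Safari/530.17 Skyfire/2.0")
```

The return shape (`nil` or `{group, match}`) is the same one the `parse/1` clauses in the parser modules already accept, so the callers would not need to change.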
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
defmodule UserAgentParser.Parsers.Base do | ||
@moduledoc """ | ||
Base and behaviour for all of our parsers | ||
""" | ||
|
||
alias UserAgentParser.Parsers.Version, as: VersionParser | ||
|
||
@callback parse(args :: term) :: result :: term | nil | ||
|
||
def replace(nil, position, match), do: Enum.at(match, position) | ||
def replace(string, position, match) do | ||
val = Enum.at(match, position) | ||
String.replace(string, "$#{position}", val) | ||
end | ||
|
||
def parse_version(group, match, keys), | ||
do: VersionParser.parse({group, match}, keys) | ||
end |
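To make the two `replace/3` clauses above easier to follow, here is a small standalone sketch of the placeholder substitution. `ReplaceSketch` is a hypothetical stand-in that copies the logic from the diff, and the regex and input string are made up for illustration.

```elixir
defmodule ReplaceSketch do
  # No replacement template configured: fall back to the captured group itself.
  def replace(nil, position, match), do: Enum.at(match, position)

  # Template configured: substitute the "$<position>" placeholder with the capture.
  def replace(template, position, match) do
    String.replace(template, "$#{position}", Enum.at(match, position))
  end
end

match = Regex.run(~r/(\w+) OS (\d+)/, "iPhone OS 9")
# match is ["iPhone OS 9", "iPhone", "9"]

ReplaceSketch.replace(nil, 1, match)         # "iPhone"
ReplaceSketch.replace("Apple $1", 1, match)  # "Apple iPhone"
```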
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
defmodule UserAgentParser.Parsers.Device do | ||
alias UserAgentParser.{Device, Parsers.Base} | ||
|
||
import Base | ||
@behaviour Base | ||
|
||
def parse(nil), do: %Device{} | ||
def parse({group, match}) do | ||
family = Keyword.get(group, :device_replacement) | ||
|
||
family = | ||
match | ||
|> Enum.with_index | ||
|> Enum.reduce(family, fn({_, index}, acc) -> | ||
replace(acc, index, match) | ||
end) | ||
|
||
%Device{family: family} | ||
end | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
defmodule UserAgentParser.Parsers.OperatingSystem do | ||
alias UserAgentParser.{OperatingSystem, Parsers.Base} | ||
|
||
import Base | ||
@behaviour Base | ||
|
||
@replacement_keys [:os_v1_replacement, | ||
:os_v2_replacement, | ||
:os_v3_replacement, | ||
:os_v4_replacement] | ||
|
||
def parse(nil), do: %OperatingSystem{} | ||
def parse({group, match}) do | ||
os = replace(group[:os_replacement], 1, match) | ||
|
||
match = Enum.slice(match, 1, 4) | ||
version = parse_version(group, match, @replacement_keys) | ||
|
||
%OperatingSystem{family: os, version: version} | ||
end | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
defmodule UserAgentParser.Parsers.UserAgent do | ||
alias UserAgentParser.{UserAgent, Parsers.Base} | ||
|
||
import Base | ||
@behaviour Base | ||
|
||
@replacement_keys [:v1_replacement, | ||
:v2_replacement, | ||
:v3_replacement, | ||
:v4_replacement] | ||
|
||
def parse(nil), do: %UserAgent{} | ||
def parse({group, match}) do | ||
agent = replace(group[:family_replacement], 1, match) | ||
|
||
match = Enum.slice(match, 1, 4) | ||
version = parse_version(group, match, @replacement_keys) | ||
|
||
%UserAgent{family: agent, version: version} | ||
end | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
defmodule UserAgentParser.Parsers.Version do | ||
alias UserAgentParser.{Version, Parsers.Base} | ||
|
||
import Base | ||
@behaviour Base | ||
|
||
def parse(nil), do: %Version{} | ||
def parse({group, match}, keys \\ []) do | ||
keys | ||
|> Enum.with_index | ||
|> Enum.map(fn({key, index}) -> | ||
group | ||
|> Keyword.get(key) | ||
|> replace(index + 1, match) | ||
end) | ||
|> version | ||
end | ||
|
||
defp version([major, minor, patch, patch_minor]), | ||
do: %Version{major: major, minor: minor, patch: patch, patch_minor: patch_minor} | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
defmodule UserAgentParser.Processor do | ||
@moduledoc """ | ||
Prepare a raw YAML document for consumption by the parser by | ||
converting charlists into strings and compiling our patterns. | ||
""" | ||
|
||
@doc """ | ||
Process a document into Elixir keyword lists and compiled patterns | ||
""" | ||
def process(document) do | ||
document | ||
|> extract | ||
|> convert | ||
|> compile | ||
end | ||
|
||
defp atom_key(key) do | ||
key | ||
|> String.Chars.to_string | ||
|> String.to_atom | ||
end | ||
|
||
defp compile(groups) do | ||
groups | ||
|> Enum.map(&compile_groups/1) | ||
|> to_tuple # result: {user_agents, os, devices} | ||
end | ||
|
||
defp compile_group(group) do | ||
pattern = | ||
group | ||
|> Keyword.fetch!(:regex) | ||
|> Regex.compile! | ||
|
||
Keyword.put(group, :regex, pattern) | ||
end | ||
|
||
defp compile_groups(groups), do: Enum.map(groups, &compile_group/1) | ||
|
||
defp convert([]), do: [] | ||
defp convert([head|tail]) do | ||
result = Enum.map(head, &to_keyword/1) | ||
[result|convert(tail)] | ||
end | ||
|
||
defp extract([document|_]) do | ||
[{'user_agent_parsers', user_agents}, {'os_parsers', os}, {'device_parsers', devices}] = document | ||
|
||
[user_agents, os, devices] | ||
end | ||
|
||
defp to_keyword([]), do: [] | ||
defp to_keyword([{key, value}|tails]) do | ||
keyword = {atom_key(key), String.Chars.to_string(value)} | ||
[keyword | to_keyword(tails)] | ||
end | ||
|
||
defp to_tuple(values, tuple \\ {}) | ||
defp to_tuple([], tuple), do: tuple | ||
defp to_tuple([head|tail], tuple) do | ||
tuple = Tuple.append(tuple, head) | ||
to_tuple(tail, tuple) | ||
end | ||
end |
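The convert and compile steps above can be hard to picture without sample data. The following sketch shows both steps for a single entry, assuming the YAML parser hands back each entry as a list of `{charlist, charlist}` pairs, as the `extract/1` pattern match suggests. `ProcessorSketch` and the sample entry are hypothetical and not taken from patterns.yml.

```elixir
defmodule ProcessorSketch do
  # Turns one parsed YAML entry into a keyword list with string values,
  # then swaps the :regex source string for a compiled Regex.
  def to_group(entry) do
    entry
    |> Enum.map(fn {key, value} ->
      {key |> to_string() |> String.to_atom(), to_string(value)}
    end)
    |> Keyword.update!(:regex, &Regex.compile!/1)
  end
end

entry = [
  {~c"regex", ~c"(Skyfire)/(\\d+)\\.(\\d+)"},
  {~c"family_replacement", ~c"Skyfire"}
]

group = ProcessorSketch.to_group(entry)
Regex.run(group[:regex], "Skyfire/2.0")  # ["Skyfire/2.0", "Skyfire", "2", "0"]
```

`String.to_atom/1` is acceptable here only because the keys come from a pattern file shipped with the project. It should not be applied to untrusted input, since atoms are never garbage collected.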
Sometimes coming up with these docs feels like a chore 😛