Boost Attribute Speed and Address Security Issue

## SECURITY ISSUE

The code was taking data, which in many cases could be externally
submitted, and was generating atoms from it in order to filter
attribute keys down.  This is a problem because atoms are not
garbage collected and only a finite number of them can exist.  An
attacker could craft enough distinct query strings to exhaust the
atom table and eventually cripple a running server.  Processing the
following request, built as the README suggests, would generate up
to 7 new atoms.

**EXAMPLE:** `/widgets?fields[widget]=a,aa,aaa,aaaa,ab,aba,abb`
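
Each comma-separated field in that request is a fresh atom candidate.
A sketch of what the old code path effectively did with it
(illustrative, not the exact module internals):

```elixir
"a,aa,aaa,aaaa,ab,aba,abb"
|> String.split(",")
|> Enum.map(&String.to_atom/1)
# allocates up to 7 brand-new atoms, which the VM never reclaims
```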

This has been corrected by using `String.to_existing_atom/1`.  Note
that there is a chance this breaks existing clients that have some
kind of dynamic attribute system in which keys are not represented
as atoms before this filter applies.
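
For illustration, the difference between the two conversions in an
`iex` session (the atom names here are made up):

```elixir
_known = :key_1                              # :key_1 now exists in the atom table
String.to_existing_atom("key_1")             #=> :key_1, reusing the existing atom
String.to_existing_atom("never_seen_before") # raises ArgumentError instead of
                                             # allocating a new atom
```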

## BOOST BREAKDOWN

The Attribute module is responsible for delegating data to a
serializer module, filtering the result by any optional fields given
in opts, and then wrapping each pair in an `Attribute` struct that
can later be used for custom formatting.

The most expensive part of this process is wrapping the returned
values in an `Attribute` struct, and that part is unavoidable.
However, there was a huge opportunity for a speed increase in how
the filtering works.  It splits the fields string, turns the pieces
into atoms, and then filters the data against that list.  Doing this
for potentially every item in a collection becomes an N+1 problem,
and it can be avoided by doing the work only once per request.
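
A minimal sketch of the shape of the problem and the once-per-request
alternative this change enables (`items` and `fields` are hypothetical
stand-ins, not the module's actual variables):

```elixir
fields = "key_1,key_5"
items  = [%{key_1: 1, key_5: 5}, %{key_1: 2, key_5: 10}]

# N+1 shape: the fields string is split and converted once per item
Enum.map(items, fn item ->
  keys = fields |> String.split(",") |> Enum.map(&String.to_existing_atom/1)
  Map.take(item, keys)
end)

# once-per-request shape: convert once, reuse the atom list for every item
keys = fields |> String.split(",") |> Enum.map(&String.to_existing_atom/1)
Enum.map(items, &Map.take(&1, keys))
```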

This work does not implement ahead-of-time filtering itself;
however, it allows for it in the future by accepting a
pre-calculated list of atoms in place of the raw fields string.
When the filter is given such a list, each calculation is roughly
240% faster for a small, trivial list of fields, and the speedup
only grows as the list of fields to filter gets longer.
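
Concretely, the fields opt can now be handed over in either shape;
the second one skips the split-and-convert step entirely (both taken
from the bench cases below):

```elixir
# string form: split apart and atomized during the build
opts_string = [fields: %{"widget" => "key_1,key_5"}]

# pre-calculated form: a list of atoms handed straight to Map.take/2
opts_atoms = [fields: %{"widget" => [:key_1, :key_5]}]
```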

The most notable changes to the code are:

 * fast-out filtering when no opts are present
 * opting for `Map.take/2` over custom map-filtering code (see the
   sketch below)
 * only generating the list of filter keys if it is needed
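
The `Map.take/2` swap replaces a hand-rolled `Enum.filter/2` over
key-value pairs with a single built-in call:

```elixir
iex> Map.take(%{key_1: 1, key_2: 2, key_3: 3}, [:key_1, :key_3])
%{key_1: 1, key_3: 3}

iex> Map.take(%{key_1: 1}, [:key_1, :missing])  # missing keys are simply skipped
%{key_1: 1}
```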

## ORIGINAL BENCH

```
 ## Builder.AttributeBench
 optimized opts for correct serializer small data  -- not supported --
 optimized opts for correct serializer big data    -- not supported --
 field opts for wrong serializer small data        1000000   1.09 µs/op
 no field opts to process small data               1000000   1.21 µs/op
 no opts to process small data                     1000000   1.22 µs/op
 no opts to process big data                       1000000   2.60 µs/op
 no field opts to process big data                 1000000   2.85 µs/op
 field opts for wrong serializer big data          1000000   2.86 µs/op
 fields opts for correct serializer big data        500000   3.43 µs/op
 fields opts for correct serializer small data      500000   3.44 µs/op
```

## AFTER BOOST

```
 ## Builder.AttributeBench
 no opts to process small data                       10000000   0.93 µs/op
 optimized opts for correct serializer small data    10000000   0.94 µs/op
 no field opts to process small data                  1000000   1.01 µs/op
 optimized opts for correct serializer big data       1000000   1.07 µs/op
 field opts for wrong serializer small data           1000000   1.10 µs/op
 no opts to process big data                          1000000   2.34 µs/op
 no field opts to process big data                    1000000   2.42 µs/op
 field opts for wrong serializer big data             1000000   2.84 µs/op
 fields opts for correct serializer big data           500000   3.23 µs/op
 fields opts for correct serializer small data         500000   3.23 µs/op
```

## SPEED INCREASE BREAKDOWN

 * filtering with a pre-calculated filter list **240%+ faster**
 * worst-case scenario (non-calculated filter list) **6% faster**
 * no field filters present **17% faster**
 * filters present, but not for the correct serializer **no speed increase**

benfalk committed May 26, 2016
1 parent d91413b commit c9bcf96
Showing 3 changed files with 114 additions and 13 deletions.

bench/builder/attribute_bench.exs (new file, 62 additions, 0 deletions)

```elixir
defmodule Builder.AttributeBench do
  use Benchfella

  @big 1..20
       |> Enum.map(&{String.to_atom("key_#{&1}"), &1})
       |> Enum.into(%{})

  @small 1..5
         |> Enum.map(&{String.to_atom("key_#{&1}"), &1})
         |> Enum.into(%{})

  bench "no opts to process small data",
    [context: data_no_opts(@small)],
    do: JaSerializer.Builder.Attribute.build(context)

  bench "no opts to process big data",
    [context: data_no_opts(@big)],
    do: JaSerializer.Builder.Attribute.build(context)

  bench "no field opts to process small data",
    [context: data_opts(@small)],
    do: JaSerializer.Builder.Attribute.build(context)

  bench "no field opts to process big data",
    [context: data_opts(@big)],
    do: JaSerializer.Builder.Attribute.build(context)

  bench "field opts for wrong serializer small data",
    [context: data_opts(@small, fields: %{"seabass" => "fin,scale"})],
    do: JaSerializer.Builder.Attribute.build(context)

  bench "field opts for wrong serializer big data",
    [context: data_opts(@big, fields: %{"seabass" => "fin,scale"})],
    do: JaSerializer.Builder.Attribute.build(context)

  bench "fields opts for correct serializer small data",
    [context: data_opts(@small, fields: %{"widget" => "key_1,key_5"})],
    do: JaSerializer.Builder.Attribute.build(context)

  bench "fields opts for correct serializer big data",
    [context: data_opts(@big, fields: %{"widget" => "key_1,key_5"})],
    do: JaSerializer.Builder.Attribute.build(context)

  bench "optimized opts for correct serializer small data",
    [context: data_opts(@small, fields: %{"widget" => [:key_2, :key_8]})],
    do: JaSerializer.Builder.Attribute.build(context)

  bench "optimized opts for correct serializer big data",
    [context: data_opts(@big, fields: %{"widget" => [:key_2, :key_8]})],
    do: JaSerializer.Builder.Attribute.build(context)

  defmodule Serializer do
    def type, do: "widget"
    def attributes(data, _), do: data
  end

  defp data_no_opts(data),
    do: %{data: data, serializer: Serializer, conn: nil}

  defp data_opts(data, opts \\ []),
    do: %{data: data, serializer: Serializer, conn: nil, opts: opts}
end
```

lib/ja_serializer/builder/attribute.ex (21 additions, 13 deletions)

```diff
@@ -4,24 +4,32 @@ defmodule JaSerializer.Builder.Attribute do
   defstruct [:key, :value]
 
   def build(context) do
-    context
-    |> fields_to_include
+    attributes(context)
+    |> filter_fields(context)
     |> Enum.map(&do_build/1)
   end
 
-  def fields_to_include(%{data: data, serializer: serializer, conn: conn} = context) do
-    attrs = apply(serializer,:attributes, [data, conn])
-    field_map = context[:opts][:fields] || %{}
-    attrs_to_include = Map.get(field_map, serializer.type)
-
-    cond do
-      attrs_to_include ->
-        include_list = String.split(attrs_to_include, ",") |> Enum.map(&String.to_atom/1)
-        Enum.filter attrs, fn({key, _value}) -> key in include_list end
-      true ->
-        attrs
+  defp attributes(%{serializer: serializer, data: data, conn: conn}) do
+    serializer.attributes(data, conn)
+  end
+
+  defp filter_fields(attrs, %{serializer: serializer, opts: opts}) do
+    case opts[:fields] do
+      fields when is_map(fields) -> do_filter(attrs, fields[serializer.type])
+      _any -> attrs
     end
   end
+  defp filter_fields(attrs, _), do: attrs
+
+  defp do_filter(attrs, nil), do: attrs
+  defp do_filter(attrs, fields) when is_list(fields),
+    do: Map.take(attrs, fields)
+  defp do_filter(attrs, fields) when is_binary(fields),
+    do: do_filter(attrs, safe_atom_list(fields))
+
+  defp safe_atom_list(field_str) do
+    field_str |> String.split(",") |> Enum.map(&String.to_existing_atom/1)
+  end
 
-  def do_build({key, value}), do: %__MODULE__{key: key, value: value}
+  defp do_build({key, value}), do: %__MODULE__{key: key, value: value}
 end
```

test/ja_serializer/formatter/attribute_test.exs (31 additions, 0 deletions)

```diff
@@ -7,6 +7,11 @@ defmodule JaSerializer.Formatter.AttributeTest do
     defstruct [:foo, :bar]
   end
 
+  defmodule SimpleSerializer do
+    def type, do: "simple"
+    def attributes(data, _), do: data
+  end
+
   defimpl JaSerializer.Formatter, for: [Example, Map] do
     def format(%{foo: foo, bar: bar}), do: [foo, bar] |> Enum.join("")
     def format(%{} = map), do: map
@@ -27,4 +32,30 @@
 
     assert {"example", "foobar"} == results
   end
+
+  test "the correct keys are filtered out with build" do
+    context = %{data: %{key_1: 1, key_2: 2, key_3: 3},
+                serializer: SimpleSerializer,
+                conn: nil,
+                opts: [fields: %{"simple" => "key_2,key_3"}]}
+
+    result = @attr.build(context)
+
+    refute :key_1 in Enum.map(result, &(&1.key))
+    assert :key_2 in Enum.map(result, &(&1.key))
+    assert :key_3 in Enum.map(result, &(&1.key))
+  end
+
+  test "the correct keys are filtered when given a list" do
+    context = %{data: %{key_1: 1, key_2: 2, key_3: 3},
+                serializer: SimpleSerializer,
+                conn: nil,
+                opts: [fields: %{"simple" => [:key_2, :key_3]}]}
+
+    result = @attr.build(context)
+
+    refute :key_1 in Enum.map(result, &(&1.key))
+    assert :key_2 in Enum.map(result, &(&1.key))
+    assert :key_3 in Enum.map(result, &(&1.key))
+  end
 end
```
