
Optimize batch encoder #250

Open · wants to merge 2 commits into base: devel
Conversation

seriyps (Member) commented Jan 15, 2021

Previously, the only way to encode an Erlang term into the PostgreSQL binary format was epgsql_wire:encode_parameters, which calls epgsql_binary:encode/3 in a loop. The problem is that epgsql_binary:encode/3 does quite a lot of codec() database lookups and structure transformations, and this work is repeated for every parameter.
That was acceptable for a normal equery, because you usually don't send many parameters with it. But now, with the introduction of COPY .. FROM STDIN WITH (FORMAT binary), we might need to encode a lot of rows that all have the same structure, so it makes sense to do all the OID table lookups in advance.
The same optimization was applied to execute_batch/3, because there we also use the same statement for multiple rows.
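For illustration, here is a hedged usage sketch of execute_batch/3 (the connection parameters, table and column names are invented, and the result shape is not asserted). Every row in the batch is bound to the same prepared statement, so all rows share one set of parameter types, which is exactly what makes resolving the codecs once per batch worthwhile:

```erlang
batch_insert_example() ->
    {ok, C} = epgsql:connect("localhost", "user", "pass", #{database => "testdb"}),
    {ok, Stmt} = epgsql:parse(C, "INSERT INTO purchases (id, amount) VALUES ($1, $2)"),
    %% Every row goes through the same prepared statement, so every row has the
    %% same parameter types - the codec/OID lookups only need to happen once.
    Results = epgsql:execute_batch(C, Stmt, [[1, 10], [2, 20], [3, 30]]),
    ok = epgsql:close(C),
    Results.
```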

The 1st commit optimizes COPY, the 2nd optimizes execute_batch.

The optimization is done by creating a "row encoder" structure that can be used to
encode one whole row at a time, so there is no need to look up the OID tables
for every row (a sketch follows below).

Also, some edoc improvements.
This structure can be reused to encode rows of the same shape without doing
OID database lookups for each element.
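To make the idea concrete, here is a minimal, self-contained sketch of such a row encoder. All names here (make_row_encoder/1, encode_row/2, the toy codec/1 funs) are illustrative only and are not the actual epgsql API; the real implementation resolves codecs through epgsql_binary and supports the full set of types:

```erlang
-module(row_encoder_sketch).
-export([demo/0]).

%% Toy codec lookup: maps a column type to an encode fun producing the COPY
%% binary field layout (int32 byte length followed by the field data).
codec(int4) ->
    fun(V) -> <<4:32/big-signed, V:32/big-signed>> end;
codec(text) ->
    fun(V) ->
        Bin = unicode:characters_to_binary(V),
        <<(byte_size(Bin)):32/big-signed, Bin/binary>>
    end.

%% Build the "row encoder" once: one encode fun per column.
make_row_encoder(Types) ->
    [codec(T) || T <- Types].

%% Encode a single row with the pre-built encoder - no codec lookups here.
encode_row(Encoder, Row) ->
    Fields = [Encode(Value) || {Encode, Value} <- lists:zip(Encoder, Row)],
    %% Every row ("tuple") in COPY binary format starts with a 16-bit column count.
    [<<(length(Encoder)):16/big-signed>> | Fields].

demo() ->
    Encoder = make_row_encoder([int4, text]),
    %% The same encoder is reused for every row of the same shape.
    [iolist_to_binary(encode_row(Encoder, Row))
     || Row <- [[1, <<"foo">>], [2, <<"bar">>]]].
```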
@@ -48,6 +48,7 @@ init(Batch) when is_list(Batch) ->
#batch{batch = Batch}.

execute(Sock, #batch{batch = Batch, statement = undefined} = State) ->
%% Each query has it's own statement
Member commented on this line:
Linguistic nitpick: "its", not "it's". "It's" == "it is"

seriyps (Member, Author) commented Jan 21, 2021

I'm not merging this yet, because I want to wait for my colleagues to run some benchmarks first.

seriyps (Member, Author) commented Feb 1, 2021

OK, we tried to import a multi-terabyte dump using both the "current" and the "optimized" version, but didn't notice any difference, primarily because the import happened to be disk-bound 😄. At least it hasn't made anything worse.
I'll either try to add some microbenchmarks or merge it as-is a bit later.

daironm-hillrom commented
Is this ready to merge?
