Optimize batch encoder #250
base: devel
Conversation
The optimization works by creating a "row encoder" structure that can encode one whole row at a time, so there is no need to look up OID tables for every value. Also includes some edoc improvements.
This structure can be reused to encode similar rows without doing OID database lookups for each element.
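A rough sketch of the "row encoder" idea, in Python rather than Erlang, with an invented stand-in for the codec/OID table (the real names and structures in epgsql differ): resolve each column's encode function once, then reuse the prepared list for every row.

```python
import struct

# Hypothetical stand-in for the OID/codec table that the driver consults.
OID_TABLE = {
    "int4": lambda v: struct.pack("!i", v),
    "text": lambda v: v.encode("utf-8"),
}

def make_row_encoder(column_types):
    """Do all table lookups up front; return a closure that only encodes."""
    encoders = [OID_TABLE[t] for t in column_types]  # one lookup per column
    def encode_row(row):
        return [enc(val) for enc, val in zip(encoders, row)]
    return encode_row

# The prepared encoder can now be applied to many rows of the same shape
# without touching the OID table again.
encode_row = make_row_encoder(["int4", "text"])
encoded = [encode_row(r) for r in [(1, "a"), (2, "b")]]
```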
@@ -48,6 +48,7 @@ init(Batch) when is_list(Batch) ->
     #batch{batch = Batch}.

 execute(Sock, #batch{batch = Batch, statement = undefined} = State) ->
+    %% Each query has it's own statement
Linguistic nitpick: "its", not "it's". "It's" == "it is"
I'm not merging yet, because I plan to wait for my colleagues to do some benchmarks first.
Ok, we tried to import a multi-terabyte dump using both the "current" and the "optimized" version, but haven't noticed any difference, primarily because the import happened to be disk-bound 😄. At least it hasn't made anything worse.
Is this ready to merge?
Before, the only way to encode an Erlang term to PostgreSQL binary format was to use `epgsql_wire:encode_parameters`, which calls `epgsql_binary:encode/3` in a loop. The problem is that `epgsql_binary:encode/3` does quite a lot of `codec()` database lookups and structure transformations, and this process repeats for each parameter. That worked reasonably well for a normal `equery`, because you usually don't send too many parameters to it. But now, with the introduction of `COPY .. FROM STDIN WITH (FORMAT binary)`, we might need to encode a lot of rows of the same structure, so it makes sense to do all the OID table lookups in advance. The same optimization was applied to `execute_batch/3`, because we use the same statement for multiple rows. The 1st commit optimizes COPY, the 2nd optimizes `execute_batch`.
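The contrast between the two code paths can be sketched as follows, in Python with invented helper names (the actual code is Erlang and uses epgsql's internal structures): the old path repeats the codec lookup for every parameter of every row, while the new path hoists the lookups out of the per-row loop.

```python
# Hypothetical codec table; in epgsql this is the codec()/OID database.
CODEC_DB = {"int4": lambda v: v.to_bytes(4, "big"), "text": str.encode}

def encode_parameters(types, params):
    # Old path: one codec lookup per parameter, repeated for every row.
    return [CODEC_DB[t](p) for t, p in zip(types, params)]

def prepare(types):
    # New path: look the codecs up once per statement / COPY operation...
    codecs = [CODEC_DB[t] for t in types]
    # ...then encode any number of rows without touching the table again.
    return lambda params: [c(p) for c, p in zip(codecs, params)]

rows = [(1, "x"), (2, "y"), (3, "z")]
enc = prepare(["int4", "text"])
# Both paths produce identical binary output for the same row.
assert all(enc(r) == encode_parameters(["int4", "text"], r) for r in rows)
```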