Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wip/add copy in #1

Merged
merged 2 commits into from
Mar 18, 2022
Merged

Wip/add copy in #1

merged 2 commits into from
Mar 18, 2022

Conversation

ArjanSchouten
Copy link

Make sure that:

  • You have read the contribution guidelines.
  • You have created a feature request first to discuss your contribution intent. Please reference the feature request ticket number in the pull request.
  • You use the code formatters provided here and have them applied to your changes. Don't submit any formatting related changes.
  • You submit test cases (unit or integration tests) that back your changes.

Issue description

New Public APIs

Additional context

Copy link

@marcel-cgi marcel-cgi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ik heb PostgresqlCopyIn nog niet in depth bekeken, maar dit is wat ik tot nu toe zie.

De integratie test lijken me zo volledig, elke test case die ik verzin is al gecovered.


PostgresqlConnection connection = createConnection(client, MockCodecs.empty(), this.statementCache);

connection.copyIn("COPY IN", Flux.empty())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ik snap niet helemaal het doel van deze test class. Is het ook mogelijk om hier een test te doen met een soort mock responses, waarmee je test dat de PostgresqlCopyIn echt aangeroepen wordt?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doel is om invloed te hebben op wat postgres terug zou kunnen sturen in theorie. Het is met een integration test bijv lastig om te simuleren dat de database halverwege een operatie de connection sluit. In dit geval is het misschien wat dubbelop maar ik vind dat je sowieso de unit testen moet schrijven.

src/main/java/io/r2dbc/postgresql/PostgresqlCopyIn.java Outdated Show resolved Hide resolved
src/main/java/io/r2dbc/postgresql/PostgresqlCopyIn.java Outdated Show resolved Hide resolved
src/main/java/io/r2dbc/postgresql/PostgresqlCopyIn.java Outdated Show resolved Hide resolved
This commit contains the happy flow copy in functionality without much testing.
This should be seen as a design proposal for the r2dbc-postgresql team.
If r2dbc-postgresql team agree with this proposed solution we can adress all testing requirements and fix all TODO comments.
@ArjanSchouten ArjanSchouten marked this pull request as ready for review March 17, 2022 12:24
Copy link

@marcel-cgi marcel-cgi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ziet er goed uit. Cool die unit test zo idd.
Zag nog paar mini dingetjes.

src/main/java/io/r2dbc/postgresql/PostgresqlCopyIn.java Outdated Show resolved Hide resolved
* @param sql the COPY sql statement
* @param stdin the ByteBuffer publisher
* @return a {@link Mono} with the amount of rows inserted
*/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moet er niet een @since bij?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ik weet nog niet in welke versie dit komt en of het ook in de 0.8.x releases gaat komen of dat we moeten wachten op de nieuwe spring-data releases?!

Voor nu ff laten zodat het de verantwoordelijkheid is van r2dbc team?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okido

Flux<ByteBuffer> input = DataBufferUtils.read(connectionHolder.csvFile, DefaultDataBufferFactory.sharedInstance, bufferSize, StandardOpenOption.READ)
.map(DataBuffer::asByteBuffer);

Long rowsInserted = connectionHolder.r2dbc.copyIn("COPY simple_test (name, age) FROM STDIN DELIMITER ';'", input)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ik krijg geen Rule 1.3 errors meer. Wel vreemd dat jdbc meer dat 2 keer zo snel is:

Benchmark                      (rows)   Mode  Cnt     Score     Error  Units
CopyInBenchmarks.copyInJdbc         0  thrpt    5  6464.981 ± 1673.001  ops/s
CopyInBenchmarks.copyInJdbc         1  thrpt    5  6083.269 ± 1046.740  ops/s
CopyInBenchmarks.copyInJdbc       100  thrpt    5  3822.937 ±  443.608  ops/s
CopyInBenchmarks.copyInJdbc      1000  thrpt    5   395.994 ±  87.967  ops/s
CopyInBenchmarks.copyInJdbc     10000  thrpt    5    97.100 ±  17.249  ops/s
CopyInBenchmarks.copyInJdbc   1000000  thrpt    5     1.076 ±   1.220  ops/s
CopyInBenchmarks.copyInR2dbc        0  thrpt    5  4280.554 ±  870.727  ops/s
CopyInBenchmarks.copyInR2dbc        1  thrpt    5   559.015 ±   77.062  ops/s
CopyInBenchmarks.copyInR2dbc      100  thrpt    5   515.306 ±  112.103  ops/s
CopyInBenchmarks.copyInR2dbc     1000  thrpt    5   122.167 ±  53.674  ops/s
CopyInBenchmarks.copyInR2dbc    10000  thrpt    5    47.122 ±   6.171  ops/s
CopyInBenchmarks.copyInR2dbc  1000000  thrpt    5     0.856 ±   0.710  ops/s

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goed om te horen van die Rule 1.3. Nog geen idee door wat dat dan veroorzaakt werd.

Ik denk dat we ons er niet op moeten blindstaren. Ik had het idee dat het wellicht kan liggen aan het lezen van file waarbij FileInputStream sneller is. Testje gedaan en dit is bij mij lokaal de output:

Benchmark                      (rows)   Mode  Cnt      Score       Error  Units
CopyInBenchmarks.copyInJdbc         0  thrpt    5  35548.527 ± 14809.055  ops/s
CopyInBenchmarks.copyInJdbc         1  thrpt    5  34664.394 ±  8705.782  ops/s
CopyInBenchmarks.copyInJdbc       100  thrpt    5  14261.899 ±  9395.174  ops/s
CopyInBenchmarks.copyInJdbc   1000000  thrpt    5      2.595 ±     0.240  ops/s
CopyInBenchmarks.copyInR2dbc        0  thrpt    5  28322.464 ±  2161.058  ops/s
CopyInBenchmarks.copyInR2dbc        1  thrpt    5  20449.850 ±   617.851  ops/s
CopyInBenchmarks.copyInR2dbc      100  thrpt    5  20647.997 ±  1632.350  ops/s
CopyInBenchmarks.copyInR2dbc  1000000  thrpt    5    116.070 ±     6.872  ops/s

Met de volgende code:

    @Benchmark
    public void copyInR2dbc(ConnectionHolder connectionHolder, Blackhole voodoo) {
        int bufferSize = 65536; // BufferSize is the same as the one from JDBC's CopyManager
        Long rowsInserted = DataBufferUtils.read(connectionHolder.csvFile, DefaultDataBufferFactory.sharedInstance, bufferSize, StandardOpenOption.READ)
            .map(x -> {
                DataBufferUtils.release(x);
                return 1;
            })
            .count()
            .block();

        voodoo.consume(rowsInserted);
    }

    @Benchmark
    public void copyInJdbc(ConnectionHolder connectionHolder, Blackhole voodoo) throws IOException, SQLException {
        try (InputStream inputStream = new FileInputStream(connectionHolder.csvFile.toFile())) {

            int i = 0;
            Scanner s = new Scanner(inputStream).useDelimiter("\n");
            while (s.hasNext()) {
                i++;
                voodoo.consume(s.next());
            }
            
            voodoo.consume(i);
        }
    }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heb jij een idee om de r2dbc variant te verbeteren? Dat is namelijk wel chique richting r2dbc om cijfers te hebben die representatief zijn...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nee, veel efficienter lijkt het me niet te krijgen dan in die benchmark. Ik weet alleen niet of 1 en 100 zulke goede benchmark cases zijn.

src/main/java/io/r2dbc/postgresql/PostgresqlCopyIn.java Outdated Show resolved Hide resolved
@ArjanSchouten ArjanSchouten merged commit dc0be6f into main Mar 18, 2022
@ArjanSchouten ArjanSchouten deleted the WIP/add-copy-in branch March 18, 2022 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants