Commits on Apr 16, 2018
  1. Testing a change in the way we load CL+SSL.

    dimitri committed Apr 16, 2018
    Apparently cl+ssl needs to be reloaded in a very specific way at image
    startup time, and provides a function to do just that. Let's try and use
    this piece of magic rather than calling cffi:load-foreign-library directly.
Commits on Mar 27, 2018
  1. Code review for previous commit.

    dimitri committed Mar 27, 2018
    See #771.
Commits on Mar 26, 2018
  1. Typo fix in docs about concurrency settings.

    dimitri committed Mar 26, 2018
Commits on Mar 16, 2018
  1. Implement support for MySQL useSSL=true|false option.

    dimitri committed Mar 16, 2018
    The MySQL connection string parameter for SSL usage is useSSL, so map an
    option name to our expected values for sslmode in database connection
    strings.
    
    See #748.
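
As a rough illustration of the mapping described above, here is a minimal Python sketch. The function name, the default value and the target sslmode names ("require", "disable", "prefer") are assumptions made for the example; the commit message only says that useSSL is mapped to the expected sslmode values.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical mapping: the sslmode values chosen here are an assumption.
USESSL_TO_SSLMODE = {"true": "require", "false": "disable"}


def sslmode_from_mysql_uri(uri, default="prefer"):
    """Derive an sslmode value from a useSSL=true|false query option."""
    query = parse_qs(urlsplit(uri).query)
    values = query.get("useSSL")
    if not values:
        return default  # option not given: keep the (assumed) default
    value = values[-1].lower()
    if value not in USESSL_TO_SSLMODE:
        raise ValueError(f"unsupported useSSL value: {value!r}")
    return USESSL_TO_SSLMODE[value]
```

With this sketch, a connection string ending in `?useSSL=false` yields "disable", and a string without the option falls back to the assumed default.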
Commits on Mar 7, 2018
  1. Fix date-with-no-separator transform.

    dimitri committed Mar 7, 2018
    The expected string length was hard-coded, which is not a good idea given
    the support for custom date formats.
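
One way to avoid a hard-coded length is to derive it from the format itself. The Python sketch below assumes a hypothetical format description made of (field, start, end) offsets; it is not the transform function's actual signature.

```python
# Hypothetical format description: (field name, start offset, end offset).
DEFAULT_FORMAT = (
    ("year", 0, 4), ("month", 4, 6), ("day", 6, 8),
    ("hour", 8, 10), ("minute", 10, 12), ("second", 12, 14),
)


def date_with_no_separator(text, fmt=DEFAULT_FORMAT):
    """Turn '20041002152952' into '2004-10-02 15:29:52', or return None."""
    expected_length = max(end for _, _, end in fmt)  # derived, not hard-coded
    if text is None or len(text) != expected_length or not text.isdigit():
        return None
    # Fields missing from a custom format default to zero.
    parts = {"hour": "00", "minute": "00", "second": "00"}
    parts.update({name: text[start:end] for name, start, end in fmt})
    return "{year}-{month}-{day} {hour}:{minute}:{second}".format(**parts)
```

A date-only custom format of eight characters is then accepted without touching the length check.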
Commits on Feb 25, 2018
  1. DB3: pick user's choice of schema name when given.

    dimitri committed Feb 25, 2018
    We would hard-code the schema name into the table's name in the DB3 case on
    the grounds that a db3/dbf file doesn't have a notion of a schema. But when
    the user wants to add data into an existing target table, then we merge the
    catalogs and must keep the given target schema and table name.
    
    Fix #701.
Commits on Feb 24, 2018
  1. Handle parsing errors in pgpass gracefully.

    dimitri committed Feb 24, 2018
    Accept empty password lines in ~/.pgpass files, and when pgloader otherwise
    fails to parse or process the file, log a warning and return a nil password.
    
    See #748.
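
The behavior described above can be sketched in Python as follows. The function names are hypothetical and the sketch takes the file content as a string rather than reading ~/.pgpass; it relies only on the documented pgpass line format (hostname:port:database:username:password, with `\:` and `\\` escapes and `*` wildcards).

```python
import logging

log = logging.getLogger("pgpass-sketch")


def split_pgpass_line(line):
    """Split on unescaped colons; a backslash keeps the next character as-is."""
    fields, current, chars = [], [], iter(line)
    for ch in chars:
        if ch == "\\":
            current.append(next(chars, "\\"))
        elif ch == ":":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def lookup_password(text, host, port, dbname, user):
    """Return the first matching password, or None when nothing usable is found."""
    wanted = (host, str(port), dbname, user)
    try:
        for raw in text.splitlines():
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            fields = split_pgpass_line(line)
            if len(fields) != 5:
                raise ValueError(f"expected 5 fields, got {len(fields)}")
            *patterns, password = fields
            if all(p in ("*", w) for p, w in zip(patterns, wanted)):
                return password  # may be the empty string
    except ValueError as err:
        log.warning("could not parse pgpass content: %s", err)
    return None
```

An entry with an empty password field is accepted and returns an empty string, while a malformed line only produces a warning and a None result.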
  2. Review Dockerfiles.

    dimitri committed Feb 24, 2018
    Upgrade to stretch in the docker builds and improve disk footprint to some
    degree, using classic docker tricks.
    
    See #748.
  3. Fix duplicate package names.

    dimitri committed Feb 24, 2018
    In a previous commit we re-used the package name pgloader.copy for the now
    separated implementation of the COPY protocol, but this package was already
    in use for the implementation of the COPY file format as a pgloader source.
    
    Oops.
    
    And CCL was happily doing its magic anyway, so that I've been blind to the
    problem.
    
    To fix, rename the new package pgloader.pgcopy, and to avoid having to deal
    with other problems of the same kind in the future, rename every source
    package pgloader.source.<format>, so that we now have pgloader.source.copy
    and pgloader.pgcopy, two visibly different packages to deal with.
    
    This light refactoring came with a challenge though. The split in between the
    pgloader.sources API and the rest of the code involved some circular
    dependencies in the namespaces. CL is pretty flexible here because it can
    reload code definitions at runtime, but it was still a mess. To untangle it,
    implement a new namespace, the pgloader.load package, where we can use the
    pgloader.sources API and the pgloader.connection and pgloader.pgsql APIs
    too.
    
    A little problem gave birth to quite a massive patch. As it happens when
    refactoring and cleaning-up the dirt in any large enough project, right?
    
    See #748.
Commits on Feb 20, 2018
  1. Add a new test case for {{ENVVAR}} template support.

    dimitri committed Feb 20, 2018
    See #555.
Commits on Feb 19, 2018
  1. Fix implementation of foreign keys in data only mode.

    dimitri committed Feb 19, 2018
    In data-only mode, the foreign keys parameter (which defaults to True) means
    something special: we remove the fkey definitions prior to the data only
    load then re-install the fkeys.
    
    This got broken in a previous commit, the WITH clause option being processed
    like the other DDL ones that only make sense when creating the schema. While
    fixing the setting in copy-database, we have to also fix a nesting bug in
    complete-pgsql-database that would prevent the fkeys from being installed
    again at the end of the load.
    
    This patch not only fixes that, but also reviews the implementation of
    the drop-pgsql-fkeys support function to use a more modern internal API,
    preparing a list of SQL statements to be sent to the psql-execute level.
    
    Fixes #745.
  2. Improve summary reporting of errors.

    dimitri committed Feb 19, 2018
    Not all error paths are counted correctly at this point, this commit
    improves the situation in passing. A thorough review should probably be
    planned sometime.
Commits on Feb 16, 2018
  1. Fix support for newid() from MS SQL.

    dimitri committed Feb 16, 2018
    Several places in the code are involved in dealing with the default values from
    MS SQL. The catalog query is dealing with strange quoting rules on the
    source side and used to fill in directly the PostgreSQL expected value. But
    then the quoting of a function call wasn't properly handled.
    
    Rather than coping with the quoting rules here, have the catalog query
    return a pgloader specific placeholder "GENERATE_UUID". Then the MS SQL
    specific code can normalize that to the symbol :generate_uuid. Then the
    generic PostgreSQL DDL code can implement the proper replacement for that
    symbol, not having to know where it comes from.
    
    Fix #742.
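
The three stages described above can be sketched in Python. Python has no symbols, so an Enum member stands in for :generate_uuid; the function names are hypothetical, and the use of `uuid_generate_v4()` as the PostgreSQL replacement is an assumption made for the example.

```python
from enum import Enum


class Default(Enum):
    GENERATE_UUID = "generate_uuid"  # stands in for the :generate_uuid symbol


def mssql_catalog_default(raw_default):
    """Stage 1 (catalog query): report newid() as a neutral placeholder."""
    if raw_default.strip("()").lower() == "newid":
        return "GENERATE_UUID"
    return raw_default


def normalize_mssql_default(default):
    """Stage 2 (MS SQL specific code): turn the placeholder into a symbol."""
    return Default.GENERATE_UUID if default == "GENERATE_UUID" else default


def format_pg_default(default):
    """Stage 3 (generic DDL code): no knowledge of where the symbol came from."""
    if default is Default.GENERATE_UUID:
        return "uuid_generate_v4()"  # assumed replacement expression
    return "'" + str(default).replace("'", "''") + "'"
```

The point of the split is that only the last stage emits PostgreSQL syntax, so the function call is never run through literal quoting.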
  2. Some improvements on the GitHub issue template.

    dimitri committed Feb 16, 2018
    Well, let's be more direct to the user.
  3. Add a GitHub issue template.

    dimitri committed Feb 16, 2018
  4. When merging catalogs, "float" and "double precision" are the same type.

    dimitri committed Feb 16, 2018
    PostgreSQL understands both spellings of the data type name and implements
    float as being a double precision value, so we should refrain from any
    warning about that non-discrepancy when doing a data-only load.
    
    Should fix #746.
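
A minimal Python sketch of such a comparison, assuming a hypothetical alias table; only the float / double precision pair comes from the commit message.

```python
# Hypothetical alias table: spellings that name the same PostgreSQL type.
TYPE_ALIASES = {"float": "double precision"}


def canonical_type(name):
    """Lower-case, collapse whitespace, then resolve known aliases."""
    name = " ".join(name.lower().split())
    return TYPE_ALIASES.get(name, name)


def same_type(source_type, target_type):
    """True when both names denote the same type, so no warning is needed."""
    return canonical_type(source_type) == canonical_type(target_type)
```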
  5. Fix SQLite SQL queries.

    dimitri committed Feb 16, 2018
    Some copy-paste errors made their way to those queries and prevented usage
    of pgloader, but I missed that because I was using a previous version of the
    query text files in my interactive environment.
    
    Also, SQLite doesn't like the queries finishing with a semi-colon, so remove
    them.
    
    Fixes #747.
Commits on Feb 8, 2018
  1. Fix "drop default" casting rules for all databases.

    dimitri committed Feb 8, 2018
    The support for drop default in (user defined) casting rules was completely
    broken in SQLite, because the code didn't even bother looking at what's
    returned after applying the casting rules.
    
    This patch fixes the code so that it uses the pgcol instance's default
    value as it stands after applying casting rules. The bug also existed in a subtle
    form for MySQL and MS SQL, but would only show up there when the default
    value is spelled using a known variation of “current timestamp”.
  2. Assorted fixes for SQLite.

    dimitri committed Feb 8, 2018
    First review the `sqlite_sequence` support so that we can still work with
    databases that don't have this catalog, which doesn't always exist -- it
    might depend on the SQLite version though.
    
    Then while at it use the sql macro to host the SQLite “queries” in their own
    files, enhancing the hackability of the system to some degree. Not that
    much, because we have to use a lot of PRAGMA commands and then the column
    output isn't documented with the query text itself.
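
For illustration, checking for the `sqlite_sequence` catalog before querying it could look like the Python sketch below, using the standard sqlite3 module. The helper names are hypothetical. SQLite creates this table once a table with an AUTOINCREMENT column exists, so a database without such tables does not have it.

```python
import sqlite3


def has_sqlite_sequence(conn):
    """Check the schema table rather than assuming the catalog exists."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'table' AND name = 'sqlite_sequence'"
    ).fetchone()
    return row is not None


def sequence_values(conn):
    """Map table name to current sequence value; empty without the catalog."""
    if not has_sqlite_sequence(conn):
        return {}
    return dict(conn.execute("SELECT name, seq FROM sqlite_sequence"))
```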
Commits on Feb 7, 2018
  1. Implement SQLite casting rule for “decimal”.

    dimitri committed Feb 7, 2018
    Fix #739.
Commits on Jan 31, 2018
  1. Fix SQLite processing of columns with a sequence attached.

    dimitri committed Jan 31, 2018
    The handling of the SQLite catalogs was fixed in a previous patch, but
    either it's been broken in between or it never actually worked (oops).
    
    Moreover, the recent patch about :on-update-current-timestamp changed the
    casting rules matching code and we should position :auto-increment from the
    SQLite module rather than "auto_increment" as before. That's better, but
    wasn't done.
    
    Fix #563 again, tested with a provided test-case (thanks!).
  2. Implement support for new casting rules guards and actions.

    dimitri committed Jan 31, 2018
    Namely the actions are “keep extra” and “drop extra” and the casting rule
    guard is “with extra on update current timestamp”. Having support for those
    elements in the casting rules allows a definition such as the following:
    
          type timestamp with extra on update current timestamp
            to "timestamp with time zone" drop extra
    
    The effect of such a cast rule would be to ignore the MySQL extra
    definition and then keep pgloader from creating the PostgreSQL triggers
    that implement the same behavior.
    
    Fix #735.
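
How a guard and an action interact can be sketched in Python. The CastRule class and apply_cast function are hypothetical stand-ins for the example, not pgloader's internal representation of casting rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CastRule:
    source_type: str
    target_type: str
    with_extra: Optional[str] = None  # guard: only match columns with this extra
    drop_extra: bool = False          # action: forget the extra after casting


def apply_cast(rules, column_type, column_extra):
    """Return (target type, extra kept) for the first matching rule."""
    for rule in rules:
        if rule.source_type != column_type:
            continue
        if rule.with_extra is not None and rule.with_extra != column_extra:
            continue  # guard not satisfied
        return rule.target_type, (None if rule.drop_extra else column_extra)
    return column_type, column_extra  # no rule matched
```

With the extra dropped, later stages have nothing left from which to create a trigger.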
Commits on Jan 25, 2018
  1. Don't push-row a nil value.

    dimitri committed Jan 25, 2018
    In case of a failure to pre-process or transform values in the row that has
    been read, we need to refrain from pushing the row into our next batch.
    
    See #726, that got hit by the recent bug in the middle of something else
    entirely.
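
The rule can be illustrated with a small Python sketch, where build_batch and its transform argument are hypothetical names: a row whose transformation fails produces no value, and that missing value must not reach the batch.

```python
def build_batch(rows, transform):
    """Collect transformed rows, skipping those that fail to transform."""
    batch, rejected = [], 0
    for row in rows:
        try:
            processed = transform(row)
        except (ValueError, TypeError):
            processed = None  # the failure yields no row at all
        if processed is None:
            rejected += 1     # count it, but never push it
            continue
        batch.append(processed)
    return batch, rejected
```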
  2. Add a restart-case for interactive debugging.

    dimitri committed Jan 25, 2018
    When dealing with MATERIALIZING VIEWS test cases and failing in the middle
    of them, as it happens when fixing bugs, then it was tedious (to say the
    least) to clean-up manually the view each time.
    
    That said, for end-users, doing it automatically would risk cleaning-up the
    wrong view definition if they had a typo in their pgloader command, say.
    
    Common Lisp helps a lot here: we simply create a restart that is only
    available interactively for the developers of pgloader!
  3. Refrain from creating tables in “data only” operations.

    dimitri committed Jan 25, 2018
    We forgot that rule in the case of creating the target tables for the
    materializing views commands, which led to surprising and wrong behavior.
    
    Fix #721, and add a new test case while at it.
  4. Review misleading error message with schema not found.

    dimitri committed Jan 25, 2018
    It might be that the schema exists but we didn't find what we expected to
    in there, so that it didn't make it to pgloader's internal catalogs. Be
    friendly to the user with a better error message.
    
    Fix #713.
Commits on Jan 24, 2018
  1. Step back on (safety 0) optimization.

    dimitri committed Jan 24, 2018
    It doesn't appear worth it at this time yet, too risky.
  2. Docs cleanup.

    dimitri committed Jan 24, 2018
    Don't maintain generated files in git, it's useless (thanks mainly to
    readthedocs), also remove the previous format of the docs.
  3. Review the pgloader COPY implementation further.

    dimitri committed Jan 24, 2018
    Refactor file organisation further to allow for adding a “direct stream”
    option when the on-error-stop behavior has been selected. This happens
    currently by default for database sources.
    
    Introduce the new WITH option “on error resume next” which forces the
    classic behavior of pgloader. The option “on error stop” already existed,
    its implementation is new.
    
    When this new behavior is activated, the data is sent to PostgreSQL
    directly, without intermediate batches being built. It means that the whole
    operation fails at the first error, and we don't have any information in
    memory to try replaying any COPY of the data. It's gone.
    
    This behavior should be fine for database migrations as you don't usually
    want to fix the data manually in intermediate files, you want to fix the
    problem at the source database and do the whole dance all-over again, up
    until your casting rules are perfect.
    
    This patch might also bring some performance benefits in terms of both
    timing and memory usage, though the local testing didn't show much of
    anything for the moment.
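
The difference between the two behaviors can be sketched in Python. This is a simplification under stated assumptions: `send` is a hypothetical callable that either stores a whole list of rows or raises ValueError and stores nothing, and the row-by-row replay below is only one possible retry strategy.

```python
def load_direct(rows, send):
    """'on error stop': stream rows one by one, keeping nothing for replay."""
    sent = 0
    for row in rows:
        send([row])  # any exception propagates and ends the whole load
        sent += 1
    return sent


def load_batched(rows, send, batch_size=3):
    """'on error resume next': buffer batches so a failed one can be replayed."""
    loaded, rejected, batch = 0, [], []

    def flush():
        nonlocal loaded
        try:
            send(batch)
            loaded += len(batch)
        except ValueError:
            # The batch is still in memory: replay it row by row.
            for row in batch:
                try:
                    send([row])
                    loaded += 1
                except ValueError:
                    rejected.append(row)
        batch.clear()

    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            flush()
    if batch:
        flush()
    return loaded, rejected
```

The direct path needs no buffer at all, which is where any memory gain would come from, at the cost of losing the ability to retry.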
Commits on Jan 23, 2018
  1. Simplify format-vector-row a lot.

    dimitri committed Jan 23, 2018
    Copy some code over from cl-postgres-trivial-utf-8 and add the support for
    PostgreSQL COPY escaping right at the same place, allowing us to allocate our
    formatted utf-8 buffer only once, with the escaping already installed.
    
    This patch was expected to be more about perfs, but it's actually only about
    code cleaning it seems, as it doesn't make a big difference in the testing I
    could do here.
    
    That said, getting rid of one intermediate buffer should be nice in terms of
    memory management.
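
The idea of escaping while encoding, so that a single output buffer is enough, can be sketched in Python. The function name is hypothetical; the escapes follow PostgreSQL's COPY text format (tab-separated fields, `\N` for NULL, backslash escapes for backslash, tab, newline and carriage return).

```python
# COPY text format escapes, already encoded as bytes.
COPY_ESCAPES = {"\\": b"\\\\", "\t": b"\\t", "\n": b"\\n", "\r": b"\\r"}


def format_copy_row(row):
    """Encode one row as a COPY text line, escaping while writing UTF-8."""
    out = bytearray()  # the only buffer: no intermediate unescaped copy
    for index, value in enumerate(row):
        if index:
            out += b"\t"
        if value is None:
            out += b"\\N"
            continue
        for ch in str(value):
            out += COPY_ESCAPES.get(ch) or ch.encode("utf-8")
    out += b"\n"
    return bytes(out)
```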
  2. Clean up source code organisation.

    dimitri committed Jan 23, 2018
    The copy format and batch facilities are no longer the meat of our
    PostgreSQL support in the src/pgsql directory, so have them live in their
    own space.
Commits on Jan 22, 2018
  1. Review format-vector-row.

    dimitri committed Jan 22, 2018
    This function prepares the data to be sent down to PostgreSQL as a clean
    COPY text with unicode handled correctly. This commit is mainly a clean-up
    of the function, and also adds some smarts to try and make it faster.
    
    In testing, the function is now marginally faster than before, but not by
    much. The hope here is that it's now easier to optimize it.
  2. Add support for the newer Qmynd error handling.

    dimitri committed Jan 22, 2018
    We now have a qmynd-impl::decoding-error condition to deal with, which has
    very good error reporting, so that we don't need to poke into babel details
    anymore. The error message adds the column name, type and collation to the
    output, too.
    
    We keep the babel handlers for a while until people have all migrated to
    using the patch in qmynd.
    
    With the fix to Qmynd, fix #716.
Commits on Jan 14, 2018
  1. Fix CSV separator parsing.

    dimitri committed Jan 14, 2018
    The previous patch introduced parser conflicts and we couldn't parse some
    expressions any more, such as the following:
    
            fields escaped by '\',
    
    It's now possible to represent single quote as either '''', '\'', or '0x27'
    and we still can parse '\' as being a single backslash character.
    
    See #705.
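
The accepted spellings listed above can be summarized with a small Python sketch. The parse_separator function is hypothetical and works on a single quoted token; the actual parser is grammar-based and handles these forms inside a full command.

```python
def parse_separator(token):
    """Parse a quoted one-character separator: '\\', '''', '\\'' or '0x27'."""
    if len(token) < 3 or token[0] != "'" or token[-1] != "'":
        raise ValueError(f"not a quoted separator: {token!r}")
    body = token[1:-1]
    if body in ("''", "\\'"):
        return "'"                      # doubled or backslash-escaped quote
    if body.lower().startswith("0x"):
        return chr(int(body[2:], 16))   # hexadecimal character code
    if len(body) == 1:
        return body                     # includes a lone backslash
    raise ValueError(f"unsupported separator: {token!r}")
```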