Commits on Oct 25, 2016
  1. Fix compilation error.

    Introduced recently when refactoring the match rules, forgot to update
    all call sites, and the bug went unnoticed for a while, oops. Not sure
    the fix is all we need to get back a working feature (alter schema
    rename to), but it lets the code compile and that's all I have the time to
    handle today.
    See #466.
    committed Oct 25, 2016
Commits on Oct 2, 2016
  1. Typo fix...

    committed Oct 2, 2016
  2. Add iwoca as a sponsor to pgloader.

    Thanks guys ;-)
    committed Oct 2, 2016
Commits on Sep 17, 2016
  1. Review identifier case :quote.

    We added some confusion about who's responsible for quoting the SQL object
    names between src/utils/quoting.lisp and src/pgsql/pgsql-ddl.lisp, and
    as a result some migrations from MySQL with identifier case set to quote
    were broken, as in #439.
    To fix, remove any use of the format directive ~s in the PostgreSQL ddl
    output methods: we consider that the quoting of ~s is to be decided in
    apply-identifier-case. We then use ~a instead of ~s.
    Fix #439.
    committed Sep 17, 2016
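A minimal sketch of the rule stated in the "Review identifier case :quote" commit, written in Python rather than pgloader's Common Lisp. The idea shown is only that quoting is decided once, when the identifier case policy is applied, and that the DDL writer then inserts the name verbatim (the role `~a` plays). The function names, the policy names and the `create_table_sql` helper are assumptions made for illustration, not pgloader's actual API.

```python
def apply_identifier_case(name: str, policy: str) -> str:
    """Decide quoting once, up front (hypothetical Python stand-in)."""
    if policy == "quote":
        # Double any embedded double quote, then wrap the name.
        return '"' + name.replace('"', '""') + '"'
    if policy == "downcase":
        return name.lower()
    raise ValueError(f"unknown identifier case policy: {policy}")


def create_table_sql(table: str, columns: list[tuple[str, str]]) -> str:
    """Emit DDL using names verbatim: no second round of quoting here."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns)
    return f"CREATE TABLE {table} ({cols});"


table = apply_identifier_case("MyTable", "quote")
column = apply_identifier_case("Id", "quote")
ddl = create_table_sql(table, [(column, "bigint")])
```

If the DDL writer quoted its arguments again, the already-quoted name would be wrapped twice, which is the kind of breakage the commit describes.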
Commits on Sep 10, 2016
  1. Improve INCLUDING rule matching for MySQL.

    In the MySQL source we have explicit support for both string equality
    and regexps for the INCLUDING and EXCLUDING clauses. This got broken
    when moved to be shared with the ALTER TABLE implementation, because
    we were no longer using the type system in the same way in all places.
    To fix, create new abstractions for strings and regexps and use those
    new structs in the proper way (thanks to defstruct and CLOS).
    Fixes #441.
    committed Sep 10, 2016
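The "Improve INCLUDING rule matching" commit can be illustrated with a small Python sketch: one type per kind of rule, each knowing how to match a table name, so that every caller dispatches on the rule's type instead of guessing. The class names, the `included` helper and the choice of `re.search` semantics are assumptions for illustration; pgloader implements this with defstruct and CLOS.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class StringMatch:
    """Rule that matches a table name by string equality."""
    name: str

    def matches(self, table_name: str) -> bool:
        return table_name == self.name


@dataclass(frozen=True)
class RegexMatch:
    """Rule that matches a table name against a regular expression."""
    pattern: str

    def matches(self, table_name: str) -> bool:
        return re.search(self.pattern, table_name) is not None


def included(table_name, rules):
    """A table is included when at least one rule matches it."""
    return any(rule.matches(table_name) for rule in rules)


rules = [StringMatch("users"), RegexMatch(r"^order_")]
selected = [t for t in ["users", "order_items", "audit"] if included(t, rules)]
```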
  2. Implement and use DROP ... IF EXISTS.

    In cases where we have a WITH include drop option, we are generating
    lots of SQL DROP statements. We may be running against an empty target
    database, or in other situations where the target object of the DROP
    command might not exist. Add support for that case.
    committed Sep 10, 2016
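As a rough illustration of the "DROP ... IF EXISTS" commit, the following Python sketch builds DROP statements that do not fail when the target object is missing. The `drop_statement` helper and the object names are hypothetical; only the `DROP <kind> IF EXISTS <name>` form itself is standard PostgreSQL syntax.

```python
def drop_statement(kind: str, name: str,
                   if_exists: bool = True, cascade: bool = False) -> str:
    """Build a DROP statement (hypothetical helper, not pgloader's code)."""
    parts = ["DROP", kind]
    if if_exists:
        # Without IF EXISTS, dropping a missing object raises an error.
        parts.append("IF EXISTS")
    parts.append(name)
    if cascade:
        parts.append("CASCADE")
    return " ".join(parts) + ";"


statements = [
    drop_statement("TABLE", "public.users"),
    drop_statement("INDEX", "public.users_email_idx"),
]
```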
Commits on Sep 2, 2016
  1. Add stats about how many files we processed.

    In the FILENAME MATCHING case it might be good to have the information,
    which can also explain some of the timing spent. The example in
    test/bossa.load currently loads data from 296 files total...
    committed Sep 2, 2016
Commits on Aug 30, 2016
  1. Fix error reporting of catalogs.

    The internal catalog representation is deeply recursive in order to
    make it easy to traverse the catalog both downward (catalog to schema
    to tables) and upward (table to its schema to its catalog).
    As a consequence we need to set *print-circle* to non-nil when we're
    going to log the catalogs, so turn it to non-nil before generating the
    log messages.
    While at it, add logging of such catalogs in the :data log verbosity
    mode. The catalog output is very verbose, but it's easy to copy/paste it
    from a bug report into being a live object we can inspect in the REPL,
    thanks to Common Lisp's notion of a reader and readable printer!
    committed Aug 30, 2016
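To show why a catalog with both downward and upward links needs a cycle-aware printer, here is a small Python analogy for the "Fix error reporting of catalogs" commit. The dictionary layout is an assumption and does not mirror pgloader's structures. Python's built-in `repr` detects the cycle on its own, whereas the Common Lisp printer has to be told to.

```python
# A catalog that points down to its schema, and a schema that points
# back up to its catalog: the structure is circular by construction.
catalog = {"name": "pgloader", "schemas": []}
schema = {"name": "public", "catalog": catalog, "tables": []}
catalog["schemas"].append(schema)

# repr() marks the back-reference with {...} instead of recursing forever.
text = repr(catalog)
```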
  2. Back to development mode, not a release anymore.

    The next version is going to be either 3.3.2 or depending on
    whether we have mainly bug fixes or new features.
    committed Aug 30, 2016
Commits on Aug 28, 2016
  1. Release pgloader v3.3.1.

    committed Aug 28, 2016
  2. Fix stats collections in some cases.

    Calling a -with-timing from within a with-stats-collection macro is
    redundant and will have the numbers counted twice. Which in this case
    didn't happen because the stats label was manually copied, but borked
    with a typo in one copy.
    committed Aug 28, 2016
  3. Update an old archive test case.

    committed Aug 28, 2016
Commits on Aug 19, 2016
  1. Rename web/ into docs/

    This lets us benefit from GitHub Pages without having to maintain a
    separate orphaned branch.
    committed Aug 19, 2016
Commits on Aug 10, 2016
  1. Improve existing PostgreSQL database handling.

    When loading data into an existing PostgreSQL catalog, we DROP the
    indexes for better performance of the data loading. Some of the indexes
    are UNIQUE or even PRIMARY KEYS, and some FOREIGN KEYS might depend on
    them in the PostgreSQL dependency tracking of the catalog.
    We used to use the CASCADE option when dropping the indexes, which hides
    a bug: if we exclude from the load tables with foreign keys pointing to
    tables we target, then we would DROP those foreign keys because of the
    CASCADE option, but fail to install them again at the end of the load.
    To prevent that from happening, pgloader now queries the PostgreSQL
    pg_depend system catalog to list the “missing” foreign keys and add them
    to our internal catalog representation, from which we know to DROP then
    CREATE the SQL object at the proper times.
    See #400 as this was an oversight in fixing this issue.
    committed Aug 10, 2016
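The "missing" foreign keys from the "Improve existing PostgreSQL database handling" commit are those owned by a table outside the load that point at a table inside it. The Python sketch below shows only that selection step over an in-memory list. The `ForeignKey` type, the helper and the sample names are hypothetical, and no pg_depend query is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    name: str
    table: str        # table that owns the constraint
    referenced: str   # table the constraint points to


def missing_foreign_keys(all_fkeys, loaded_tables):
    """Foreign keys owned by excluded tables that reference loaded tables."""
    loaded = set(loaded_tables)
    return [fk for fk in all_fkeys
            if fk.referenced in loaded and fk.table not in loaded]


fkeys = [
    ForeignKey("orders_user_fk", "orders", "users"),
    ForeignKey("audit_user_fk", "audit", "users"),
]
missing = missing_foreign_keys(fkeys, loaded_tables=["users", "orders"])
```

Here `audit` is excluded from the load, so its foreign key to `users` is the one that has to be tracked explicitly in order to be dropped and created again.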
Commits on Aug 8, 2016
Commits on Aug 7, 2016
  1. Fix foreign key definition formatting.

    When we do have a condef (constraint definition in the PostgreSQL
    catalog slang), use it rather than trying to invent it again from the
    bits and pieces. See #400, which it actually fixes now...
    committed Aug 7, 2016
  2. Fix typo: Performance, singular.

    Fixed #432.
    committed Aug 7, 2016
  3. Improve pgloader bundle distribution.

    Include the local git clones in the bundle so that git is not needed at
    build time for consumers of the bundle. Fixes #428.
    committed Aug 7, 2016
  4. Allow any character in a quoted CSV field name.

    We used to enforce overly strict rules for a quoted field name in a CSV
    load file; now accept any character but a quote as part of the field
    name.
    Fixes #416.
    committed Aug 7, 2016
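The relaxed rule from the "Allow any character in a quoted CSV field name" commit can be written as a one-line pattern. This Python sketch assumes the quote character is a double quote and uses a regular expression, while pgloader's own parser is a grammar. The function name is hypothetical.

```python
import re

# Opening quote, one or more characters that are not a quote, closing quote.
QUOTED_FIELD_NAME = re.compile(r'"([^"]+)"')


def parse_quoted_field_name(text: str) -> str:
    """Return the field name inside the quotes, or raise ValueError."""
    m = QUOTED_FIELD_NAME.fullmatch(text)
    if m is None:
        raise ValueError(f"not a quoted field name: {text!r}")
    return m.group(1)


name = parse_quoted_field_name('"unit price ($/kg)"')
```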
Commits on Aug 6, 2016
  1. Implement support for existing target databases.

    Also known as the ORM case, it happens that other tools are used to
    create the target schema. In that case pgloader's job is to fill in the
    existing target tables with the data from the source tables.
    We still focus on load speed and pgloader will now DROP the
    constraints (Primary Key, Unique, Foreign Keys) and indexes before
    running the COPY statements, and re-install the schema it found in the
    target database once the data load is done.
    This behavior is activated when using the “create no tables” option as
    in the following test-case setup:
      with create no tables, include drop, truncate
    Fixes #400, for which I got a test-case to play with!
    committed Aug 6, 2016
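The order of operations in the "Implement support for existing target databases" commit can be sketched as a plan of steps: drop constraints and indexes, copy the data, then put the schema back. The helper and step labels below are hypothetical, and creating indexes again before constraints is an assumption of this sketch rather than something the commit states.

```python
def existing_target_plan(constraints, indexes, tables):
    """Return the ordered steps for loading into a pre-existing schema."""
    plan = []
    plan += [("drop constraint", c) for c in constraints]
    plan += [("drop index", i) for i in indexes]
    plan += [("copy", t) for t in tables]
    # Re-install what was found in the target database before the load.
    plan += [("create index", i) for i in indexes]
    plan += [("add constraint", c) for c in constraints]
    return plan


plan = existing_target_plan(["users_pkey"], ["users_email_idx"], ["users"])
```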
Commits on Aug 5, 2016
  1. Use internal catalog when loading from files.

    Replace the ad-hoc code that was used before in the load from file code
    path to use our full internal catalog representation, and adjust APIs to
    that end.
    The goal is to use catalogs everywhere in the PostgreSQL target API,
    allowing us to process and reason explicitly about source and target
    catalogs; see #400 for the main use case.
    committed Aug 5, 2016
Commits on Aug 1, 2016
  1. Improve our internal catalog representation.

    First, add indexes and foreign keys to the list of objects supported by
    the shared catalog facility, where they were only found in the pgsql schema
    specific package for historical reasons.
    Then also add to our catalog internal structures the notion of a trigger
    and a stored procedure, allowing for cleaner advanced default values
    support in the MySQL cast functions.
    Now that we have a proper and complete catalog, review the pgsql module
    DDL output function in terms of the catalog and rewrite the schema
    creation support so that it takes direct benefit of our internal
    catalogs representation.
    In passing, clean-up the code organisation of the pgsql target support
    module to be easier to work with.
    Next step consists of getting rid of src/pgsql/queries.lisp: this
    facility should be replaced by the usage of a target catalog that we
    fetch the usual way, thanks to the new src/pgsql/pgsql-schema.lisp file
    and list-all-* functions.
    That will in turn allow for an explicit step of merging the pre-existing
    PostgreSQL catalog when it's been created by other tools than pgloader,
    that is when migrating with the help of an ORM. See #400 for details.
    committed Aug 1, 2016
Commits on Jul 31, 2016
  1. Clean-up overloaded parse rule for numbers.

    The MSSQL index filters parser needs to parse digits and keep them as
    text, but was piggybacking on the main parsers and the fixed file format
    positions parser by re-using the rule name "number".
    My understanding was that by calling `defrule' in different packages one
    would create a separate set of rules. It might have been wrong from the
    beginning or just changed in newer versions of esrap. Will have to
    investigate more.
    This fixes #434 while not applying suggested code: the comment about
    where to fix the bug is spot on.
    Also, it should be noted that the regression tests framework seems to be
    failing us and returns success in that error case, despite code
    installed to properly handle the situation. This will also need to be
    committed Jul 31, 2016
Commits on Jun 20, 2016
  1. Fix column names quoting in reset-all-sequences.

    The other user-provided names (schema and table) were already quoted
    using the quote_ident() PostgreSQL function, but the column name (attname
    in the catalogs) was not.
    Blind attempt to fix #425.
    committed Jun 20, 2016
Commits on Jun 17, 2016
  1. Update bootstrap CentOS scripts (#424)

    * Corrects CentOS7 instruction (incorrect group name)
    * Update CentOS 6 bootstrap info
      - More recent SBCL (1.1 -> 1.3)
      - Missing freetds dependency
    gvangool committed with Jun 17, 2016
Commits on May 31, 2016
  1. Override encoding in every testing connection (#410)

    Also: reuse connection in process-regression-test.
    Fix #408.
    KrzysiekJ committed with May 31, 2016
Commits on May 18, 2016
  1. Add the “set not null” cast option for MySQL (#407)

    Use case: Django dissuades setting NULL “on string-based fields […]
    because empty string values will always be stored as empty strings, not
    as NULL. If a string-based field has null=True, that means it has two
    possible values for »no data«: NULL, and the empty string. In most
    cases, it’s redundant to have two possible values for »no data«; the
    Django convention is to use the empty string, not NULL.”.
    pgloader already supports custom transformations which can be used to
    replace NULL values in string-based columns with empty strings. Setting
    NOT NULL constraint on those columns could possibly be achieved by
    running a database query to extract their names and then generating
    relevant ALTER TABLE statements, but a cast option in pgloader is a more
    convenient way.
    KrzysiekJ committed with May 18, 2016
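The two-step alternative described in the "set not null" commit (replace NULL with an empty string, then add the constraint by hand) might look like this in Python. Both helpers and the table and column names are hypothetical. Only the `ALTER TABLE ... ALTER COLUMN ... SET NOT NULL` form is standard PostgreSQL syntax.

```python
def null_to_empty_string(value):
    """Transformation step: store '' wherever the source has NULL."""
    return "" if value is None else value


def set_not_null_sql(table: str, column: str) -> str:
    """Statement that would otherwise be generated per string column."""
    return f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL;"


row = [null_to_empty_string(v) for v in ["alice", None]]
statement = set_not_null_sql("auth_user", "last_name")
```

The cast option folds the second step into the migration itself, so no follow-up query against the catalog is needed.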
  2. Improve docs for FILENAMES MATCHING support.

    This format of source file specifications is available for CSV, COPY and
    FIXED formats but was only documented for the CSV one. The paragraph is
    copy/pasted around in the hope to produce per-format man pages and web
    documentation in a fully automated way sometime.
    Fix #397.
    committed May 18, 2016
Commits on May 16, 2016
  1. add the postgres debian ppa key in the correct way (#406)

    * add the postgres debian ppa key in the correct way
    * experimental: remove dist-upgrade
    * experimental: install asdf/sbcl via apt
    spaghetti- committed with May 16, 2016
Commits on May 5, 2016
Commits on Apr 27, 2016
  1. fix type drop to cascade (#393)

    If a function or operator uses the type being dropped, the drop fails with:
    error: cannot drop type because other objects depend on it
    porshkevich committed with Apr 27, 2016
  2. Fix non-deterministic projection in MySQL query.

    In MySQL the information_schema.statistics table lists all indexes and
    has a row per index column, which means that the index level properties
    are duplicated on every row of the view.
    Our query against that catalog was lazily assuming the classic and
    faulty MySQL behavior where GROUP BY would allow non-aggregated columns
    to be reported even when the result isn't deterministic. This patch
    fixes that by using a trick: the NON_UNIQUE column is 0 for a unique
    index and 1 otherwise, so we sum the numbers and test the sum for
    equality with 0.
    Fix #345 again.
    committed Apr 27, 2016
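The NON_UNIQUE trick from the "Fix non-deterministic projection in MySQL query" commit can be tried out with Python's built-in sqlite3 module standing in for MySQL. The table below is a simplified, hypothetical copy of information_schema.statistics with one row per index column. The query is a sketch of the technique, not pgloader's actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statistics ("
    " table_name TEXT, index_name TEXT, column_name TEXT, non_unique INTEGER)"
)
conn.executemany(
    "INSERT INTO statistics VALUES (?, ?, ?, ?)",
    [
        ("users", "users_pkey", "id", 0),
        ("users", "users_name_idx", "last_name", 1),
        ("users", "users_name_idx", "first_name", 1),
    ],
)

# Every selected column is either grouped or aggregated, so the result is
# deterministic: the sum is 0 only when all rows of the index say "unique".
rows = conn.execute(
    "SELECT table_name, index_name,"
    "       sum(non_unique) = 0 AS is_unique, count(*) AS ncols"
    "  FROM statistics"
    " GROUP BY table_name, index_name"
    " ORDER BY index_name"
).fetchall()
conn.close()
```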