Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update chunker requirements #114

Merged
merged 1 commit into from
May 19, 2015
Merged

Update chunker requirements #114

merged 1 commit into from
May 19, 2015

Conversation

bjk-soundcloud
Copy link
Contributor

The chunker does not require an auto_increment column, it simply
requires an integer column in order to find the start/limit of the
table.

Relax the migrator/table checks to only look for an id INT column.

  • Add support for BIGINT id columns
  • Add fixture for BIGINT id column
  • Add fixture for "broken" varchar id column
  • Update specs

@bjk-soundcloud
Copy link
Contributor Author

@hannestyden There are still some specs to fix, but I wanted to get this out early for review.

@bjk-soundcloud bjk-soundcloud force-pushed the bjk/autoinc branch 2 times, most recently from 1687817 to 262540b Compare May 14, 2015 13:39
@grobie
Copy link
Contributor

grobie commented May 14, 2015

While it's true that the chunker does not strictly require an auto_increment column, it's implemented on the assumption that the identifier column is monotonically increasing. AFAIK the chunker behaves quite poorly in case of huge gaps between rows.

I'd suggest to add a note to the README explaining that a monotonically increasing counter is recommended / warning that a migration will be quite slow with column of different value distributions.

@bjk-soundcloud
Copy link
Contributor Author

The original version of LHM did not require auto_increment. This requirement was added at the time we added support for id as a surrogate key. But it breaks LHM for things like the "purgatory" tables where they are a mirror of a schema, but have sparse data that was removed from their mirror.

The real issue is when the id data is sparse. I would like to re-write the chunker to use pt-online-schema-change style linked pagination. This would solve a the sparseness issue.

I can add a note about sparse id issues.

@bjk-soundcloud bjk-soundcloud force-pushed the bjk/autoinc branch 3 times, most recently from 33ba4ac to d22340b Compare May 14, 2015 14:29
@bjk-soundcloud bjk-soundcloud changed the title [WIP] Update chunker requirements Update chunker requirements May 14, 2015
@bjk-soundcloud
Copy link
Contributor Author

@arthurnn @hannestyden Ok, all the tests pass.

@bjk-soundcloud
Copy link
Contributor Author

Ping, @hannestyden. 😄

@bjk-soundcloud
Copy link
Contributor Author

@grobie How does this look to you now?

@@ -7,6 +7,7 @@ end
```
* #98 - Add slave lag throttler. (@camilo, @jasonhl)
* #92 - Fix check for table requirement before starting a lhm.(@hannestyden)
* #114 - Update chunker requirements (@bjk-soundcloud)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this is in the middle? (can you move it to the top? )

The chunker does not require an auto_increment column, it simply
requires an integer column in order to find the start/limit of the
table.

Relax the migrator/table checks to only look for an `id` INT column.
* Add support for BIGINT `id` columns
* Add fixture for BIGINT `id` column
* Add test for BIGINT `id` columns
* Add fixture for "broken" varchar `id` column
* Update specs
* Update README with changes, and note about sparse `id` data
@hannestyden
Copy link
Contributor

Looking good! 👍

@janogonzalez
Copy link

👍

bjk-soundcloud added a commit that referenced this pull request May 19, 2015
@bjk-soundcloud bjk-soundcloud merged commit c8ca500 into master May 19, 2015
@bjk-soundcloud bjk-soundcloud deleted the bjk/autoinc branch May 19, 2015 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants