Modification duplicate removal can fail for large source control histories. #10052

Open
Peterburnett opened this issue Jan 5, 2022 · 5 comments
Labels: bug, materials, no stalebot (Don't mark this stale.)

Comments

Peterburnett commented Jan 5, 2022

Issue Type
  • Bug Report
Summary

When a material update contains more than 32767 modifications, the deduplication check (MaterialRepository.checkAndRemoveDuplicates, per the stack trace below) seems to pass all of them into a single query as parameters, causing a driver error:

Caused by: java.io.IOException: Tried to send an out-of-range integer as a 2-byte value: 82273

leading to

org.springframework.dao.DataAccessResourceFailureException: could not execute query; nested exception is org.hibernate.exception.JDBCConnectionException: could not execute query
        at org.springframework.orm.hibernate3.SessionFactoryUtils.convertHibernateAccessException(SessionFactoryUtils.java:639)
        at org.springframework.orm.hibernate3.HibernateAccessor.convertHibernateAccessException(HibernateAccessor.java:416)
        at org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:416)
        at org.springframework.orm.hibernate3.HibernateTemplate.executeWithNativeSession(HibernateTemplate.java:379)
        at org.springframework.orm.hibernate3.HibernateTemplate.findByCriteria(HibernateTemplate.java:1126)
        at org.springframework.orm.hibernate3.HibernateTemplate.findByCriteria(HibernateTemplate.java:1118)
        at com.thoughtworks.go.server.persistence.MaterialRepository.checkAndRemoveDuplicates(MaterialRepository.java:797)
Environment
  • Go Version: 21.2.0 (12498-16e1ac6956cd5177a99dc3fe33503661881c354f)
  • JAVA Version: 15.0.2
  • OS: Linux 5.8.0-1041-aws
Steps to Reproduce
  1. Create a git repo and a branch to dump a large amount of history into. The branch should start with a small history; a fresh git init with a single commit of an empty file (or similar) is ideal.
  2. Create a test pipeline with dummy data and a Git material pointed at that repo and branch.
  3. Trigger the pipeline so that the changes link on the pipeline registers the commit. The run can be cancelled once it has started.
  4. Use git remote add moodle https://github.com/moodle/moodle.git to add a remote project with a large master history.
  5. Fetch the remote (for example with git fetch moodle), then use git merge -X theirs --allow-unrelated-histories moodle/master to merge it into your repo and branch.
  6. Push the result to the repo and trigger the pipeline again.
  7. Wait a few minutes; a global error about a dead transaction appears.
Expected Results

The pipeline should parse all the commits correctly and show the latest commit from the merged remote repo as the latest revision.

Actual Results

A global error about a dead transaction is shown instead.

Possible Fix

Chunk the deduplication check query so that it can't exceed the driver's limit on the number of query parameters (the count is sent as a 2-byte integer, so at most 32767).
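
A minimal sketch of what chunking could look like, in plain Java. This is not GoCD code: the class ChunkedLookup, the methods partition and findExisting, the constant MAX_PARAMS_PER_QUERY, and the queryChunk callback are all hypothetical names used for illustration. The callback stands in for whatever query the deduplication check actually runs per batch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of GoCD.
final class ChunkedLookup {
    // Assumed batch size, chosen to stay well below the 32767 limit from the report.
    static final int MAX_PARAMS_PER_QUERY = 10_000;

    // Split a list into consecutive chunks of at most chunkSize elements.
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    // Run the lookup once per chunk and merge the results, so that no single
    // call to queryChunk receives more than chunkSize revisions.
    static Set<String> findExisting(List<String> revisions,
                                    int chunkSize,
                                    Function<List<String>, Collection<String>> queryChunk) {
        Set<String> existing = new LinkedHashSet<>();
        for (List<String> chunk : partition(revisions, chunkSize)) {
            existing.addAll(queryChunk.apply(chunk));
        }
        return existing;
    }
}
```

With the 82273 modifications from the error above and a batch size of 10000, this would issue nine queries instead of one. The trade-off is more round trips to the database in exchange for never exceeding the parameter limit.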

chadlwilson (Member) commented Jan 5, 2022

Thanks for the well articulated report!

I haven't tried it, and may have misunderstood the error location, but I wonder if there is a workaround: use trigger-with-options to pick a specific revision far enough back in history that it covers fewer than 32k commits, then progressively trigger a couple of subsequent builds. It's possible you can't even get past that step, though, as trigger-with-options might rely on the same code to validate the revisions. In that case, I wonder if there is a way to edit the git material to achieve something similar (maybe not). 🤔

arvindsv (Member) commented Jan 5, 2022

I wonder if setting up that material for shallow cloning helps.

chadlwilson (Member) commented Jan 5, 2022

> I wonder if setting up that material for shallow cloning helps.

My recollection is that shallow cloning only applies on agents, and the server always does full --no-checkout clones as it still needs to get the full revision history to be able to collect the delta from previous run (the bit probably triggering the error above). But worth a go.

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because it has not had activity in the last 90 days.
If you can still reproduce this error on the master branch using local development environment or on the latest GoCD Release, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

stale bot added the stale label Apr 16, 2022
stale bot closed this as completed Apr 24, 2022
chadlwilson added the no stalebot (Don't mark this stale.) label and removed the stale label Apr 24, 2022
chadlwilson reopened this Apr 24, 2022
chadlwilson (Member) commented

Related to #7788 and #443
