Fix #5: Support for very large revisions and very large files. #9

Merged
merged 54 commits into from
Oct 5, 2015

cstroe
Owner

@cstroe cstroe commented Oct 5, 2015

File content for files committed to an SVN repository is now processed in chunks, so files of any size can be processed without holding their entire content in memory.

To achieve this, the SvnDumpConsumer mechanism was made more granular, all the way down to the file content level (consume(SvnNode), consume(FileContentChunk)). Additional consumption and ending methods were added to SvnDumpConsumer to support this.

The ending methods (endRevision, endNode, endChunks) were added because they let SvnDumpConsumer implementations be much cleaner without adding much complexity. They also give a lot of flexibility over the parsing process.
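
To illustrate the callback order described above, here is a minimal sketch of a consumer that reports per-node content size while seeing only one chunk at a time. All types here (`DemoRevision`, `DemoNode`, `DemoChunk`, `DemoConsumer`, `SizeReportConsumer`) are hypothetical stand-ins, and the method signatures are assumptions; the real SvnDumpConsumer interface may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for SvnRevision, SvnNode and FileContentChunk.
final class DemoRevision {
    final int number;
    DemoRevision(int number) { this.number = number; }
}

final class DemoNode {
    final String path;
    DemoNode(String path) { this.path = path; }
}

final class DemoChunk {
    final byte[] bytes;
    DemoChunk(byte[] bytes) { this.bytes = bytes; }
}

// Assumed callback shape; the real SvnDumpConsumer signatures may differ.
interface DemoConsumer {
    void consume(DemoRevision revision);
    void consume(DemoNode node);
    void consume(DemoChunk chunk);
    void endChunks();
    void endNode(DemoNode node);
    void endRevision(DemoRevision revision);
}

// Sums content bytes per node; no chunk is retained after it is counted.
final class SizeReportConsumer implements DemoConsumer {
    final List<String> report = new ArrayList<>();
    private long bytesInNode;

    @Override public void consume(DemoRevision revision) { report.add("r" + revision.number); }
    @Override public void consume(DemoNode node) { bytesInNode = 0; }
    @Override public void consume(DemoChunk chunk) { bytesInNode += chunk.bytes.length; }
    @Override public void endChunks() { /* all content for the current node has been seen */ }
    @Override public void endNode(DemoNode node) { report.add(node.path + " " + bytesInNode); }
    @Override public void endRevision(DemoRevision revision) { report.add("end r" + revision.number); }
}

final class SizeReportDemo {
    // Plays the role of the parser: one revision, one node, two chunks.
    static List<String> run() {
        SizeReportConsumer consumer = new SizeReportConsumer();
        DemoRevision rev = new DemoRevision(1);
        DemoNode node = new DemoNode("trunk/big.bin");
        consumer.consume(rev);
        consumer.consume(node);
        consumer.consume(new DemoChunk(new byte[4]));
        consumer.consume(new DemoChunk(new byte[3]));
        consumer.endChunks();
        consumer.endNode(node);
        consumer.endRevision(rev);
        return consumer.report;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [r1, trunk/big.bin 7, end r1]
    }
}
```

Because the summary line is emitted in `endNode`, the consumer never needs to know in advance how many chunks a node has, which is the kind of simplification the ending methods are meant to allow.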

Added an SvnFileDumpParserDoppelganger to support parsing in-memory SvnDump objects. It should possibly be removed in the future, with its functionality merged into SvnDumpFileParser.

cstroe and others added 30 commits September 30, 2015 08:28
… grained (down from SvnRevision to SvnNode).

* Move TerminatingValidator out of test source into main source, as it might be useful.
… having the Consumer do that.

* SvnDumpFileParser will not add nodes to an SvnRevision, as those should be consumed separately.
* Update SvnDumpWriterImpl and SvnDumpInMemory to work with this new way.
* Previously, this was implemented as a separate class, ConsumerChain.
* Because of the granularity changes to SvnDumpFileParser, which now parses SvnNodes separately from their SvnRevision, many SvnDumpConsumers can no longer operate properly.
* Instead of teaching the ConsumerChain how to represent and handle SvnNode additions/deletions that may come from an SvnDumpMutator, I'm teaching SvnDumpConsumers how to chain to each other.
* This removes the need for a separate ConsumerChain class.
* This also places the burden of continuing the consumer chain on each consumer, which is a downside of this approach.
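
The chaining idea in the bullets above could look roughly like the following sketch. It is not the project's code: `ChainedConsumer`, `continueTo`, `forward`, `DropPrefix`, and `Collector` are hypothetical names, and nodes are reduced to path strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: each consumer holds a reference to the next one.
abstract class ChainedConsumer {
    private ChainedConsumer next;

    // Links a consumer after this one and returns it so links can be chained.
    ChainedConsumer continueTo(ChainedConsumer next) {
        this.next = next;
        return next;
    }

    // Default behavior: pass the node path along unchanged.
    void consumeNode(String path) {
        forward(path);
    }

    // Each consumer must call this itself to keep the chain going.
    protected final void forward(String path) {
        if (next != null) {
            next.consumeNode(path);
        }
    }
}

// A mutator that drops nodes under a prefix simply by not forwarding them.
final class DropPrefix extends ChainedConsumer {
    private final String prefix;
    DropPrefix(String prefix) { this.prefix = prefix; }

    @Override void consumeNode(String path) {
        if (!path.startsWith(prefix)) {
            forward(path);
        }
    }
}

// A consumer that records every node that reaches it, then forwards it.
final class Collector extends ChainedConsumer {
    final List<String> seen = new ArrayList<>();

    @Override void consumeNode(String path) {
        seen.add(path);
        forward(path);
    }
}

final class ChainDemo {
    static List<String> run() {
        DropPrefix head = new DropPrefix("tags/");
        Collector tail = new Collector();
        head.continueTo(tail);
        head.consumeNode("trunk/a.txt");
        head.consumeNode("tags/v1/a.txt"); // dropped by the mutator
        head.consumeNode("trunk/b.txt");
        return tail.seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [trunk/a.txt, trunk/b.txt]
    }
}
```

The sketch also shows the downside mentioned above: a consumer that forgets to call `forward` silently ends the chain, whether or not that was intended.
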
Cosmin Stroe added 24 commits October 2, 2015 11:31
…ependently from other nodes in the revision.

* This means we don't have to hold all the nodes in memory at the same time, just one node.
* This should fix the "SvnRevision >30GB" requirement.
* We're going to split up the file content of SvnNodes into smaller pieces, so that we don't have to store the entire file content in memory.
* Add ending calls to SvnDumpConsumer: endRevision(...), endNode(...), and endChunks(). This makes the SvnDumpWriterImpl code much cleaner and clearer.
…fixes #5.

* You can change the chunk size by calling setFileContentChunkSize(long) on the SvnDumpFileParser.
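
As a rough illustration of reading file content in bounded pieces with a configurable chunk size, consider the sketch below. `ChunkedReader` and its `setChunkSize(int)` are hypothetical and only mimic the idea behind `setFileContentChunkSize(long)`; the actual SvnDumpFileParser internals are not shown in this pull request and may work differently.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical reader; it never holds more than one chunk in memory.
final class ChunkedReader {
    private int chunkSize = 8192; // assumed default, for illustration only

    void setChunkSize(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    // Reads exactly contentLength bytes and hands them to the sink
    // in pieces of at most chunkSize bytes.
    void read(InputStream in, long contentLength, Consumer<byte[]> sink) throws IOException {
        long remaining = contentLength;
        while (remaining > 0) {
            int want = (int) Math.min(chunkSize, remaining);
            byte[] chunk = new byte[want];
            int filled = 0;
            // InputStream.read may return fewer bytes than requested, so loop.
            while (filled < want) {
                int n = in.read(chunk, filled, want - filled);
                if (n < 0) {
                    throw new EOFException("stream ended before all content was read");
                }
                filled += n;
            }
            sink.accept(chunk);
            remaining -= want;
        }
    }
}

final class ChunkedReaderDemo {
    // Reads 10 bytes with a chunk size of 4 and records each chunk's length.
    static List<Integer> run() {
        ChunkedReader reader = new ChunkedReader();
        reader.setChunkSize(4);
        List<Integer> sizes = new ArrayList<>();
        try {
            reader.read(new ByteArrayInputStream(new byte[10]), 10,
                    chunk -> sizes.add(chunk.length));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [4, 4, 2]
    }
}
```

A smaller chunk size lowers peak memory use at the cost of more callback invocations per file; a larger one does the opposite.
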
cstroe pushed a commit that referenced this pull request Oct 5, 2015
Fix #5: Support for very large revisions and very large files.
@cstroe cstroe merged commit 069a95d into master Oct 5, 2015
@cstroe cstroe deleted the issue5 branch October 5, 2015 04:41