WT-2381: dump utility discards table config #2493

Merged
merged 31 commits into from Mar 14, 2016


keithbostic
Contributor

@sueloverso, I ended up waiting for a bunch more tests to run today, so I went a little further on this one.

I think what's going on here is that simple tables are a special case: there's no colgroup entry in a simple table's metadata entry, and we're not finding the table's underlying file information.

This works for complex tables because their column-groups are listed in the table's metadata entry: even though we will write the same (incorrect) information for the complex table that we write for the simple table, it will be overridden by the correct information stored for the specific column-groups.

I did a "fix" by special-casing simple tables in the dump code, but that could be completely wrong. I don't have a handle on how this "ought" to behave, it's just the only path to victory I saw.

I thought a little bit about testing: the only thing that came to mind was changing test_dump to compare the metadata before and after the dump and re-load (dump the WiredTiger.wt file before and after the dump/re-load and assert it's the same). Obviously, you have to strip all of the checkpoint information and maybe some other stuff before that will work -- I didn't go far down that path. We'd need some way to parse the dump output in Python, too. There's a Python package that does that kind of parsing (pyparsing), but we'd have to make sure it's installed before we could use it.
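A rough sketch of the comparison described above, in plain Python without pyparsing. Everything here is an assumption made for illustration: the helper names are hypothetical, the metadata is modelled as a dict of URI to configuration string, and the list of "volatile" keys to strip is a guess rather than the set the test would actually need. It also ignores commas inside quoted values.

```python
def split_top_level(config):
    """Split a config string on commas that are not nested in () or []."""
    items, depth, start = [], 0, 0
    for i, ch in enumerate(config):
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif ch == "," and depth == 0:
            items.append(config[start:i])
            start = i + 1
    items.append(config[start:])
    return [item for item in items if item]


def strip_keys(config, volatile=("checkpoint", "checkpoint_lsn")):
    """Drop top-level keys expected to change across a dump/re-load.

    The default key list is an assumption, not a verified set.
    """
    kept = []
    for item in split_top_level(config):
        key = item.split("=", 1)[0].strip()
        if key not in volatile:
            kept.append(item)
    return ",".join(kept)


def metadata_matches(before, after):
    """Compare two {uri: config} dicts after stripping volatile keys."""
    def normalize(metadata):
        return {uri: strip_keys(value) for uri, value in metadata.items()}
    return normalize(before) == normalize(after)
```

A test could build the two dicts from the metadata dumped before and after the re-load and assert that `metadata_matches` returns true. The comparison is order-sensitive, so it would report a mismatch if the re-load reordered keys within an entry.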

Anyway, hope this is useful to you, just toss the branch if it's not.

The loadtext command requires a URI argument.
Rewrite the command-line tool entries that refer to "tables or files"
to refer simply to tables; it's simpler and less confusing, and users
are unlikely to be using file URIs.
Fix for WiredTiger "simple" table handling. Simple tables have column-group
entries, but they aren't listed in the metadata's table entry. Figure out if
it's a simple table and in that case, retrieve the column-group entry and
use the value from its "source" file.
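The lookup this commit message describes can be sketched with a toy model of the metadata. This is an illustration only, not the C code in the branch: the metadata is modelled as a dict of dicts instead of configuration strings, the function names are hypothetical, and the URI layout (`colgroup:<table>` for a simple table, `colgroup:<table>:<name>` for a complex one) is assumed here.

```python
def colgroup_uris(metadata, table_uri):
    """Return the column-group URIs backing a table."""
    name = table_uri.split(":", 1)[1]
    listed = metadata[table_uri].get("colgroups", [])
    if not listed:
        # Simple table: the table entry lists no column groups, but a
        # single column-group entry named after the table still exists.
        return ["colgroup:" + name]
    return ["colgroup:%s:%s" % (name, cg) for cg in listed]


def source_configs(metadata, table_uri):
    """Map each column group's source URI to that source's config."""
    result = {}
    for cg_uri in colgroup_uris(metadata, table_uri):
        source = metadata[cg_uri]["source"]
        result[source] = metadata[source]
    return result
```

With this model, a simple table `table:t` resolves through `colgroup:t` to its source file, so the file's configuration is found even though the table entry never names a column group.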
@keithbostic keithbostic self-assigned this Feb 12, 2016
@sueloverso
Member

@keithbostic I added a new unit test. It fails 3 out of 5 scenarios. The good news is that your change on this branch does help - on develop it fails 4 out of 5 scenarios.

keithbostic and others added 23 commits February 24, 2016 16:27
Include all of the WT_SESSION::create config in the ordinary LSM
metadata so it is merged correctly into the dump header.  Provide
an upgrade path for LSM metadata in the old format.

** Backwards breaking change for LSM: ** once metadata is upgraded
to the new format, LSM trees cannot be opened with older versions of
WiredTiger.
Allow any URI through with an underlying type of file; that allows the
creation of top-level lsm:XXX objects, that is, LSM objects that aren't
underneath tables.
LSM doesn't support column-store keys; don't try to test that combination.
Parenthesize a macro argument.
Add LSM tables to the dump test.

After dumping/re-loading the database, confirm that the contents of the
database are the same by comparing the objects returned by wt list.

Replace simple_populate_check_cursor/complex_populate_check_cursor with
simple_populate_check/complex_populate_check, then we don't have to open
a cursor.
Rework the dump utility and the dump library support. Previously, the
metadata cursor returned a full set of WT_SESSION.create configuration
values, basically, the requested configuration values plus the default
values, where the requested configuration values overrode the default
configuration values. This doesn't work because dump takes a few
configuration strings and collapses them into a single string, and the
default configuration values start overriding real configuration values.
Change the metadata cursor to return only the requested configuration
values instead, and change dump to add in the default values when it's
collapsing the strings into a single string.
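The ordering problem in the paragraph above can be shown with a small model. This is a sketch under assumptions, not WiredTiger code: configurations are modelled as dicts, `collapse` is a hypothetical stand-in in which later entries override earlier ones, and the default values are made up for the example.

```python
# Illustrative defaults only; not WiredTiger's real default values.
DEFAULTS = {"leaf_page_max": "32KB", "block_compressor": "none"}


def collapse(configs):
    """Merge configs in order; later entries override earlier ones."""
    merged = {}
    for cfg in configs:
        merged.update(cfg)
    return merged


def with_defaults(requested):
    """What the metadata cursor used to return: defaults plus requested."""
    return collapse([DEFAULTS, requested])


# Values the application actually asked for on two metadata entries.
table_requested = {"block_compressor": "snappy"}
file_requested = {"leaf_page_max": "4KB"}

# Old behavior: each entry arrives with defaults folded in, so the
# second entry's default block_compressor clobbers the real value.
old = collapse([with_defaults(table_requested),
                with_defaults(file_requested)])

# New behavior: entries carry only requested values, and the defaults
# are applied once, first, when the strings are collapsed.
new = collapse([DEFAULTS, table_requested, file_requested])
```

In this model `old["block_compressor"]` ends up as the default while `new["block_compressor"]` keeps the requested value, which is the difference the rework is after.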

Add a function __wt_schema_create_final; it takes a set of configuration
strings, adds in the default WT_SESSION.create configuration values, and
collapses them into a single string.

Add a function __wt_config_strip_and_collapse; it behaves similarly to
__wt_config_collapse, except it doesn't add in the default strings.

Fix bugs where we weren't copying returned metadata strings into local
memory.

Fix bugs where we weren't correctly parsing the URIs in the metadata
file.
One of the changes in 77ac147 changed the test for column-group and
index names, and the changed version matches both simple and complex
entries. Leave the changed test alone, instead don't look for separate
column-group and index entries in the case of a simple table.
WT-2381 Rewrite LSM metadata to fix dump / load.
@keithbostic
Contributor Author

And it removes the need for metadata:create. It does mean we'd return "special" stuff out of the metadata file, but I don't think we care. Good catch, I'll take a run at it.

Michael notes dump no longer needs to use the metadata:create URI. That
simplifies the change; most importantly, we no longer need two versions
of config_collapse.
@keithbostic
Contributor Author

@michaelcahill, I've gone ahead and pushed the change you suggested.

As far as I'm concerned, this one is ready for merge.

@michaelcahill
Contributor

Thanks, @keithbostic, lgtm. I'll merge.

michaelcahill added a commit that referenced this pull request Mar 14, 2016
WT-2381: dump utility discards table config
@michaelcahill michaelcahill merged commit 9714aa7 into develop Mar 14, 2016
@michaelcahill michaelcahill deleted the wt-2381 branch March 14, 2016 02:38
4 participants