Improve "You forgot to include one of the files/devices" error #1535

coffeemug · 2013-10-11T06:08:25Z

A few times now, people have gotten the error message "You forgot to include one of the files/devices that you included when the database was first created." Unfortunately, it's really confusing because it gives no additional information.

We should change it to print the names of the missing files it's expecting.


danielmewes · 2013-10-11T21:26:39Z

This error does not seem to have anything to do with missing files anymore. At least I can delete any of the files in rethinkdb_data and I do not see this error message. Maybe it's caused by corrupted metadata?

This of course makes it even more confusing and only adds to the need to change the error message once we know what this error really means nowadays.

coffeemug · 2013-10-12T06:28:54Z

Cool, we'll talk to @srh about it when he gets back. If anyone knows what this means -- it's him.

danielmewes · 2013-10-14T20:29:29Z

@mlucy seems to be investigating this as part of #1534 (see his comment there).

mlucy · 2013-10-15T05:48:32Z

I'm looking into this a little more, but my current working hypothesis is that you get this error message when you load a 32-bit rethinkdb_data directory with 64-bit rethinkdb.

mlucy · 2013-10-16T07:33:22Z

I can confirm that one cause of this issue is loading a 32-bit rethinkdb_data directory with 64-bit rethinkdb. We should fix this.

coffeemug · 2013-10-17T21:26:54Z

@mlucy -- would you mind opening a separate issue for that? (It seems different from fixing the "forgot to include files" error).

mlucy · 2013-10-18T07:44:48Z

@coffeemug -- I don't think you actually get this error if you forget to include a file. I think this is a general "fix this error message to say what's actually going wrong" issue.

coffeemug · 2013-10-18T16:31:58Z

I see, makes sense. So from what I understand there are three questions/issues here:

1. Fix the 32/64 bit issue
2. Once that's fixed, figure out in what circumstances the message is actually printed (there were some questions about that)
3. Improve the error message itself

danielmewes · 2013-10-18T16:39:18Z

As far as I understand it will never be printed unless the DB is corrupted (or we have other bugs like the 32bit one). Maybe it should just be made a guarantee?

coffeemug · 2013-10-23T22:05:27Z

Assigning to @Tryneus.

Tryneus · 2013-11-01T20:58:07Z

So, as near as I can tell, this is a combination of two problems. First of all, the multiplexer_config_block_t type is neither __attribute__((packed)) nor padded. Secondly, its creation_timestamp_t member is a typedef of time_t, which has a platform-dependent size.

The first is a simple enough fix. As for time_t, we could convert all existing time_t code to use a statically-sized type, or add padding in a wrapper struct. Any suggestions?
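To make the two problems above concrete, here is a minimal sketch. The struct names, fields, and helper functions are hypothetical and are not the real multiplexer_config_block_t; the sketch only assumes a header with a time_t member, and shows a packed variant with a fixed-width timestamp whose layout is checked at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ctime>

// Hypothetical on-disk header, loosely modeled on the description above.
// Its size and member offsets depend on sizeof(time_t) and on the padding
// the compiler inserts, so they can differ between 32-bit and 64-bit builds.
struct native_block_t {
    char magic[4];
    std::time_t creation_timestamp;
    std::int32_t serializer_id;
};

// Same fields with a fixed-width timestamp and no compiler-inserted padding.
// __attribute__((packed)) is a GCC/Clang extension.
struct __attribute__((packed)) portable_block_t {
    char magic[4];
    std::int64_t creation_timestamp;
    std::int32_t serializer_id;
};

// These checks fail at compile time if the layout ever drifts.
static_assert(sizeof(portable_block_t) == 16,
              "on-disk size must not vary by platform");
static_assert(offsetof(portable_block_t, creation_timestamp) == 4,
              "timestamp offset must be fixed");
static_assert(offsetof(portable_block_t, serializer_id) == 12,
              "serializer id offset must be fixed");

// Convert the in-memory form to the fixed-layout form.
portable_block_t to_portable(const native_block_t &in) {
    portable_block_t out;
    std::memcpy(out.magic, in.magic, sizeof(out.magic));
    out.creation_timestamp = static_cast<std::int64_t>(in.creation_timestamp);
    out.serializer_id = in.serializer_id;
    return out;
}

// Copy the fixed-layout form into a caller-provided byte buffer.
void write_block(const portable_block_t &block, unsigned char *out,
                 std::size_t out_len) {
    assert(out_len >= sizeof(block));
    std::memcpy(out, &block, sizeof(block));
}
```

Note that this only pins down sizes and offsets. Byte order is still whatever the host uses, which is the same on 32-bit and 64-bit x86 but not on every architecture.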

mlucy · 2013-11-01T21:01:54Z

Using a platform-independent size for time_t sounds best to me.

Tryneus · 2013-11-01T23:36:03Z

Ok, after fixing those, ran across another crash, so there's more to do.

error: Error in ../src/buffer_cache/semantic_checking.tcc at line 146:
error: Assertion failed: [block_id == get_block_id()] 
error: Backtrace:
error: Sat Oct 19 03:36:07 2013

       1: rethinkdb_backtrace(void**, int) at thread_stack_pcs.cc:151
       2: lazy_backtrace_t::lazy_backtrace_t() at backtrace.cc:250
       3: format_backtrace(bool) at backtrace.cc:197
       4: report_fatal_error(char const*, int, char const*, ...) at errors.cc:68
       5: scc_buf_lock_t<mc_cache_t>::scc_buf_lock_t(scc_transaction_t<mc_cache_t>*, unsigned long long, access_t, buffer_cache_order_mode_t, lock_in_line_callback_t*) at semantic_checking.tcc:146
       6: btree_store_t<rdb_protocol_t>::acquire_sindex_block_for_read(read_token_pair_t*, scc_transaction_t<mc_cache_t>*, scoped_ptr_t<scc_buf_lock_t<mc_cache_t> >*, unsigned long long, signal_t*) at btree_store.cc:324
       7: btree_store_t<rdb_protocol_t>::btree_store_t(serializer_t*, std::string const&, long long, bool, perfmon_collection_t*, rdb_protocol_t::context_t*, io_backender_t*, base_path_t const&) at btree_store.cc:78
       8: rdb_protocol_t::store_t::store_t(serializer_t*, std::string const&, long long, bool, perfmon_collection_t*, rdb_protocol_t::context_t*, io_backender_t*, base_path_t const&) at protocol.cc:1096
       9: void do_construct_existing_store<rdb_protocol_t>(std::vector<threadnum_t, std::allocator<threadnum_t> > const&, int, store_args_t<rdb_protocol_t>, serializer_multiplexer_t*, scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*, store_view_t<rdb_protocol_t>**) at file_based_svs_by_namespace.cc:54
       10: void boost::_bi::list6<boost::_bi::value<std::vector<threadnum_t, std::allocator<threadnum_t> > >, boost::arg<1>, boost::_bi::value<store_args_t<rdb_protocol_t> >, boost::_bi::value<serializer_multiplexer_t*>, boost::_bi::value<scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*>, boost::_bi::value<store_view_t<rdb_protocol_t>**> >::operator()<void (*)(std::vector<threadnum_t, std::allocator<threadnum_t> > const&, int, store_args_t<rdb_protocol_t>, serializer_multiplexer_t*, scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*, store_view_t<rdb_protocol_t>**), boost::_bi::list1<int const&> >(boost::_bi::type<void>, void (* const&)(std::vector<threadnum_t, std::allocator<threadnum_t> > const&, int, store_args_t<rdb_protocol_t>, serializer_multiplexer_t*, scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*, store_view_t<rdb_protocol_t>**), boost::_bi::list1<int const&>&, int) const at bind.hpp:594
       11: void boost::_bi::bind_t<void, void (*)(std::vector<threadnum_t, std::allocator<threadnum_t> > const&, int, store_args_t<rdb_protocol_t>, serializer_multiplexer_t*, scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*, store_view_t<rdb_protocol_t>**), boost::_bi::list6<boost::_bi::value<std::vector<threadnum_t, std::allocator<threadnum_t> > >, boost::arg<1>, boost::_bi::value<store_args_t<rdb_protocol_t> >, boost::_bi::value<serializer_multiplexer_t*>, boost::_bi::value<scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*>, boost::_bi::value<store_view_t<rdb_protocol_t>**> > >::operator()<int>(int const&) const at bind_template.hpp:54
       12: void pmap<boost::_bi::bind_t<void, void (*)(std::vector<threadnum_t, std::allocator<threadnum_t> > const&, int, store_args_t<rdb_protocol_t>, serializer_multiplexer_t*, scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*, store_view_t<rdb_protocol_t>**), boost::_bi::list6<boost::_bi::value<std::vector<threadnum_t, std::allocator<threadnum_t> > >, boost::arg<1>, boost::_bi::value<store_args_t<rdb_protocol_t> >, boost::_bi::value<serializer_multiplexer_t*>, boost::_bi::value<scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*>, boost::_bi::value<store_view_t<rdb_protocol_t>**> > > >(int, boost::_bi::bind_t<void, void (*)(std::vector<threadnum_t, std::allocator<threadnum_t> > const&, int, store_args_t<rdb_protocol_t>, serializer_multiplexer_t*, scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*, store_view_t<rdb_protocol_t>**), boost::_bi::list6<boost::_bi::value<std::vector<threadnum_t, std::allocator<threadnum_t> > >, boost::arg<1>, boost::_bi::value<store_args_t<rdb_protocol_t> >, boost::_bi::value<serializer_multiplexer_t*>, boost::_bi::value<scoped_array_t<scoped_ptr_t<rdb_protocol_t::store_t> >*>, boost::_bi::value<store_view_t<rdb_protocol_t>**> > > const&) at pmap.hpp:48
       13: file_based_svs_by_namespace_t<rdb_protocol_t>::get_svs(perfmon_collection_t*, uuid_u, long long, stores_lifetimer_t<rdb_protocol_t>*, scoped_ptr_t<multistore_ptr_t<rdb_protocol_t> >*, rdb_protocol_t::context_t*) at file_based_svs_by_namespace.cc:142
       14: watchable_and_reactor_t<rdb_protocol_t>::initialize_reactor(io_backender_t*) at reactor_driver.tcc:312
       15: boost::_mfi::mf1<void, watchable_and_reactor_t<rdb_protocol_t>, io_backender_t*>::operator()(watchable_and_reactor_t<rdb_protocol_t>*, io_backender_t*) const at mem_fn_template.hpp:163
       16: void boost::_bi::list2<boost::_bi::value<watchable_and_reactor_t<rdb_protocol_t>*>, boost::_bi::value<io_backender_t*> >::operator()<boost::_mfi::mf1<void, watchable_and_reactor_t<rdb_protocol_t>, io_backender_t*>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf1<void, watchable_and_reactor_t<rdb_protocol_t>, io_backender_t*>&, boost::_bi::list0&, int) at bind.hpp:307
       17: boost::_bi::bind_t<void, boost::_mfi::mf1<void, watchable_and_reactor_t<rdb_protocol_t>, io_backender_t*>, boost::_bi::list2<boost::_bi::value<watchable_and_reactor_t<rdb_protocol_t>*>, boost::_bi::value<io_backender_t*> > >::operator()() at bind_template.hpp:21
       18: callable_action_instance_t<boost::_bi::bind_t<void, boost::_mfi::mf1<void, watchable_and_reactor_t<rdb_protocol_t>, io_backender_t*>, boost::_bi::list2<boost::_bi::value<watchable_and_reactor_t<rdb_protocol_t>*>, boost::_bi::value<io_backender_t*> > > >::run_action() at callable_action.hpp:28
       19: callable_action_wrapper_t::run() at runtime_utils.cc:67
       20: coro_t::run() at coroutines.cc:178

Tryneus · 2013-11-04T20:58:25Z

Ok, had to add __attribute__((packed)) to two more structures: btree_superblock_t and metablock_manager_t<metablock_t>::crc_metablock_t, and I can get the database files running between 32-bit and 64-bit architectures. Going to do a little more testing, but we still can't guarantee this works unless we are sure that all data serialization is also safe, which is a little tougher to do.
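One way to gain confidence in serialization code generally is to avoid depending on struct layout at all and write each field byte by byte at a fixed width. The sketch below illustrates that idea under stated assumptions: the function names are hypothetical, little-endian is an arbitrary choice, and none of this is taken from the RethinkDB serializer.

```cpp
#include <cassert>
#include <cstdint>

// Write a 64-bit value as exactly 8 bytes, least significant byte first.
// The result does not depend on the host's word size, padding, or byte order.
void encode_u64_le(std::uint64_t value, unsigned char out[8]) {
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
    }
}

// Read back a value written by encode_u64_le.
std::uint64_t decode_u64_le(const unsigned char in[8]) {
    std::uint64_t value = 0;
    for (int i = 0; i < 8; ++i) {
        value |= static_cast<std::uint64_t>(in[i]) << (8 * i);
    }
    return value;
}

// Small usage example: the byte order is fixed and the value round-trips.
void check_round_trip() {
    unsigned char buf[8];
    encode_u64_le(0x0102030405060708ULL, buf);
    assert(buf[0] == 0x08 && buf[7] == 0x01);
    assert(decode_u64_le(buf) == 0x0102030405060708ULL);
}
```

The trade-off is more code per field compared with packing a struct, in exchange for a format that is defined by the encoder rather than by the compiler.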

Tryneus · 2013-11-05T01:01:43Z

Alright, fixes are up in code review 1007. I have not run across any other crashes during testing. The only thing I am slightly concerned about is that reql_time object precision is slightly different when I export the same database on 32-bit and 64-bit machines. For example:

32-bit export:

"datetime": {"timezone": "+00:00", "$reql_type$": "TIME", "epoch_time": 1383607457.7049999}

64-bit export:

"datetime": {"timezone": "+00:00", "$reql_type$": "TIME", "epoch_time": 1383607457.705}

As you can see, the 64-bit data is 'more' correct, since we are supposed to truncate to milliseconds. I have not tracked down the source of this discrepancy, but it appears to happen only on 32-bit, so at the moment I believe it is just a problem with our reql_time implementation on the 32-bit architecture. In any case, this can be its own issue, as it is not a problem with our on-disk format's portability.

Tryneus · 2013-11-06T03:10:13Z

The portability changes have been approved and merged to next in commits 4bdab21, c1097e3, f005bfa, and 9a9dcfc. This will be in release 1.11.

I also investigated the minor differences I found in exported time objects. They appear to be a result of the Python implementation on the two systems; there was no difference in the data received by the client. Therefore, there will be no new issue for that.
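For anyone wondering how two different-looking exports can come from identical data: the two strings above parse to the same double, so a difference in how many digits get printed is enough to produce them. That the two Python builds differ in exactly this way is an assumption, since the thread does not say what differed. The sketch below uses hypothetical helper names and only standard C library calls.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Format a double with a given number of significant digits ("%.*g").
std::string format_double(double value, int significant_digits) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.*g", significant_digits, value);
    return std::string(buf);
}

// True if two decimal strings parse to the same double.
bool same_double(const char *a, const char *b) {
    return std::strtod(a, nullptr) == std::strtod(b, nullptr);
}
```

With these helpers, same_double("1383607457.7049999", "1383607457.705") is true, format_double(1383607457.705, 17) gives the longer form seen in the 32-bit export, and format_double(1383607457.705, 13) gives the shorter form seen in the 64-bit export.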

coffeemug mentioned this issue Oct 11, 2013

Crash fetching some documents or dumping data #1534

Closed

ghost assigned srh Oct 12, 2013

mlucy mentioned this issue Oct 15, 2013

In place migration tool for rethinkdb data #1010

Closed

ghost assigned Tryneus Oct 23, 2013

Tryneus closed this as completed Nov 6, 2013
