Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CXX-727 Make it possible to rename collections #367

Closed
wants to merge 3 commits into from

Conversation

koraa
Copy link
Contributor

@koraa koraa commented Nov 4, 2015

No description provided.

The method was unimplemented and renaming databases is
currently not something mongodb provides for it's drivers.

See https://jira.mongodb.org/browse/SERVER-701
/// @param overwrite_old Whether to overwrite any
/// existing collections called new_name. Default: false
/// @throws exception::operation if the operation fails.
void name(stdx::string_view new_name, bool overwrite_old = false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename overwrite_old to drop_target_before_rename to match the C driver?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling this rename would be more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, I will change both.
drop_target_before_rename hmm. IMO overwrite_old is much more clear.
rename Sure about that? I deliberately elected not to call it rename, because this implements a getter/setter pattern. If mongocxx generally uses getters/setters I think it would be more idiomatic to consider the name a gettable and settable property too.

@Machyne
Copy link
Contributor

Machyne commented Nov 4, 2015

Thanks for the PR and the work! :) I'll be happy to merge this in after you fix some small comments.

void collection::name(stdx::string_view new_name, bool overwrite_old) {
bson_error_t error;

_impl->name = (std::string)new_name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this line to line 83 in case the rename fails. Also, please just write _impl->name = new_name. The cast should not be needed as string_view defines a conversion operator to std::string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: for my original comment I was confused and looking at mongoc_collection_t instead of the impl class, follow @amidvidy 's comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amidvidy I initially did not use the cast, but it failed. Further investigation revealed that the string conversion operator in mnmlstc is marked explicit.

https://mnmlstc.github.io/core/string.html#basic_string_view:T:::castto-std::basic_string:T:-operator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understood. In that case, please use static_cast, we try to avoid C-style casts.

@amidvidy
Copy link
Contributor

amidvidy commented Nov 4, 2015

Can you add a unit test to collection_mocked showing that _impl->name changes to the new name if the call to the C-driver succeeds, and that _impl->name does not change if the call to the C-driver fails? @Machyne can help you out with the mocking framework. Thanks again for the PR.

auto result = mongoc_collection_rename(
_impl->collection_t
, _impl->database_name.c_str()
, _impl->name.c_str()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should just be new_name.data(). :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure about that?

https://mnmlstc.github.io/core/string.html#basic_string_view:T:::dataCCE

"This pointer is not guaranteed to be null terminated, and should be treated as such."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... I suggested it because it is used elsewhere (ex: collection constructor).
c_str is probably safer though.

@koraa
Copy link
Contributor Author

koraa commented Nov 4, 2015

@amidvidy Is there any particular reason why collection::name() stores the name internally rather than returning a string view on mongoc_collection_get_name()?

If we altered name() to use that we could free ourselves from the added state synchronization and also save on the added test.

@koraa koraa force-pushed the karo/rename branch 3 times, most recently from fdc00c4 to d7bd777 Compare November 4, 2015 17:30
@amidvidy
Copy link
Contributor

amidvidy commented Nov 4, 2015

good idea @koraa, I do not see why it is needed (some of this code predates me). I think that would be a good change to make. If you do that, we can remove the name member from collection::impl as well.

@koraa
Copy link
Contributor Author

koraa commented Nov 4, 2015

perfect...

void collection::rename(stdx::string_view new_name, bool drop_target_before_rename) {
bson_error_t error;

auto result = mongoc_collection_rename(_impl->collection_t, _impl->database_name.c_str(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @koraa , please also add this symbol to the list in src/mongocxx/private/libmongoc_symbols.hpp and then call libmongoc::collection_rename instead of calling mongoc_collection_rename directly.

@samantharitter
Copy link
Contributor

Thanks for your work @koraa !

On the name() vs. rename() debate, I prefer rename(). Yes, we do follow a getter/setter pattern in much of the codebase, but most of those are for attributes that exist in driver classes for convenience. This is a bit different because it translates directly to a database operation, and since that's called renameCollection() we should use rename() here.

Storing the name and returning it with name() actually makes some sense. You can't get a valid collection object without knowing its name, and since we have it we can avoid network round trips to the database if we just cache the name. With renaming, we'll have to update the cached value.

@koraa
Copy link
Contributor Author

koraa commented Nov 4, 2015

Well, then. It's got tests now (let's see it whether the CI successfully runs them because locally they did not work in the first place).

I am also using sring_view::data() now to get the c_string, but strictly speaking this is not correct: the string returned by data() is not guaranteed to be a correctly zero-terminated string. It frequently contains one, when it represents a full std::string or a full c string, but in cases where string_view is most useful – when it references part of a bigger underlying buffer – it wont be.

We could try to solve this by creating a helper function that checks whether string_view::end() references a \0 byte, but this optimization is usually not applicable, because we don't know anything about the character referenced by end(); it could be valid, but it also could be part of a buffer used and written to by another thread or it could be unallocated memory.
Ultimately we need to copy the string.

See https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/LEPhIf_msXw

Optimally you would get mongoc/bsonc to support string parameters with sizes.

@@ -466,4 +466,24 @@ TEST_CASE("Collection", "[collection]") {
REQUIRE(bulk_operation_execute_called);
REQUIRE(bulk_operation_destroy_called);
}

SECTION("Rename Fails, name stays the same", "collection::rename") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you run clang-format on this too?

@Machyne
Copy link
Contributor

Machyne commented Nov 4, 2015

Thanks for adding the test!

I agree with your comment on .data(); it would be great if libmongoc took lengths instead of require null delimited strings, but for now we'll have to make a copy. I'm glad that you found this!

@koraa
Copy link
Contributor Author

koraa commented Nov 4, 2015

@Machyne Actually I think the correct way to address this is to redesign the API to take strings OR c strings (with the string implementation just extracting the c string). This way copying is only necessary in rare cases and explicit.

Then you should get mongoc to take strings with lengths and then go back to using string_views.

If you just copy unconditionally, you will create lot's of unnecessary overhead.

@koraa
Copy link
Contributor Author

koraa commented Nov 4, 2015

@Machyne As it turns out, std::string old_name{mongo_coll.name()}; is incorrect. It breaks in clang. Thus I believe the c-style cast is indeed the correct way of invoking an explicit cast operator.

C++14 is totally awesome, but I believe that syntax inherited from C and C++03 still have very valid applications in modern C++14 code for their specific use cases. Don't discourage a syntax just because it seems old :)

@samantharitter
Copy link
Contributor

The reason we discourage C-style casts in the driver is that C++ provides its own casts, and these are safer as they allow the compiler to do more checking. They also require the programmer to be much more deliberate about the cast and are much easier to search for in code.

However because this isn't a performance-critical path (renaming collections often is kind of an anti-pattern) I think the best solution here is for us to make the extra copy and build a string from the string_view. Doesn't string_view provide a to_string() method we could use for this?

We don't really want to provide an interface that takes char*s. One thing we've tried very hard to do is to fully separate the user-facing interface from the underlying implementation (the C driver). Having a char* interface leaks that detail in a weird way. Ideally the C driver would provide an interface taking a char* and len and we could use that to avoid the copies while still using string_view.

Thank you for finding this, we'll have to do an audit of other string_view uses in the codebase. I've opened CXX-730 to track that work.

@koraa
Copy link
Contributor Author

koraa commented Nov 4, 2015

@samantharitter I am well aware of the c++ casts, but this has nothing to do with runtime polymorphism. This is explicitly a call to the string view's explicit-marked cast operator.

@koraa
Copy link
Contributor Author

koraa commented Nov 5, 2015

By the way, I just looked it up and libmongoc actually does not query the database in order to retrieve the database name; it is already caching the database name in an internal structure, which kind of makes sense since mongoc_collection_get_name() needs to return a persistent pointer.
Interestingly, since that function just returns a pointer to that internal buffer, all of thosee pointers will automatically reference the new value when it's changed.

https://github.com/mongodb/mongo-c-driver/blob/master/src/mongoc/mongoc-collection-private.h#L33
https://github.com/mongodb/mongo-c-driver/blob/master/src/mongoc/mongoc-collection.c#L1656

@koraa koraa force-pushed the karo/rename branch 3 times, most recently from 1554f42 to af3d0d7 Compare November 5, 2015 11:07
@samantharitter
Copy link
Contributor

Hi @koraa, sorry for the delay in getting back to you. Thanks for looking into the libmongoc code, I should not have assumed that it was running a db operation there.

I wasn't aware that string_data had an operator basic_string, that's very handy. However, since that conversion is still going to copy the underlying string, let's either put a comment explaining the intention there above the call, or use to_string(), which will have the same effect but is in my opinion clearer.

@@ -44,6 +44,8 @@ MONGOCXX_LIBMONGOC_SYMBOL(client_pool_try_pop)
MONGOCXX_LIBMONGOC_SYMBOL(client_set_read_prefs)
MONGOCXX_LIBMONGOC_SYMBOL(client_set_ssl_opts)
MONGOCXX_LIBMONGOC_SYMBOL(client_set_write_concern)
MONGOCXX_LIBMONGOC_SYMBOL(collection_get_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please sort these new entries into alphabetical order with the other collection_* symbols

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(oh, right. Thought I'd done that)

@koraa
Copy link
Contributor Author

koraa commented Nov 9, 2015

Hi @samantharitter,
thanks for getting back to me: The current version of the code does not do any string conversion any more since I entirely removed name() in favor of using the mongoc function; this eradicated the need to store the name somewhere.

Neither did I copy the string_view to fix the problem with the lack of a zero terminator since I thought you'd have some internal discussion about how you're going to deal with that.

@samantharitter
Copy link
Contributor

I saw that you'd gone back to using .data(), that's totally fine. Under the umbrella of CXX-730 we're going to go through and fix our uses of string_view, we can address this use there instead.

@@ -38,8 +38,7 @@ class collection::impl {
const class client::impl* client, stdx::string_view name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since name is no longer used here, let's remove it. I think the only place we call this constructor is collection.cpp line 83.

@Machyne
Copy link
Contributor

Machyne commented Nov 9, 2015

One small comment, otherwise LGTM! Thanks again for the PR :)

Previous to this we have been caching the name of
a collection, but this is not necessary since the C api
already does the same.
Until now we had support for getting a collection's name by
using the getter collection::name(...); this overloads this
method with an additional name setter.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants