Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefer string_view in the API over std::string #2553

Merged
merged 1 commit into from
Jan 8, 2024
Merged

Conversation

benjaminwinger
Copy link
Collaborator

@benjaminwinger benjaminwinger commented Dec 5, 2023

For #2521, but it also is generally preferable over const std::string& since std::string already has a layer of indirection to refer to the owned string data.

This change doesn't really have any cost other than in QueryResult::writeToCSV I think (where the string_view needs to be converted to a const char* to pass it to ofstream::open, which may not work since std::string_view isn't necessarily null-terminated, so I have it converting it to a std::string to ensure that it is).

I also modified Database to store an absolute path, instead of a copy of the path passed to it, to solve a minor issue I'd noticed: that the database will try to move if you chdir after creating it using a relative path. (Removed as this won't play nicely with the VFS at the moment, and integrating the changes would be more work that should probably be done separately).

This doesn't by itself fix #2521. In addition to some functions where it would be more complicated to remove the std::string (e.g. QueryResult::toString), some of the types themselves contain STL types which won't have a consistent size between the release and debug builds. To fix that, we can hide the implementations so that the Database, Connection, QueryResult, etc. are just wrappers containing a unique_ptr to internal types which contain the actual implementation (which will also keep our exposed API more concise), but that's more complicated and can be done in a separate PR.

@benjaminwinger benjaminwinger force-pushed the string-api branch 2 times, most recently from ab39df2 to 6dd8a78 Compare December 7, 2023 17:55
Copy link

codecov bot commented Dec 7, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (7e4e0ea) 93.37% compared to head (47d5bfc) 93.37%.

Files Patch % Lines
src/main/connection.cpp 60.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2553   +/-   ##
=======================================
  Coverage   93.37%   93.37%           
=======================================
  Files        1063     1063           
  Lines       39403    39404    +1     
=======================================
+ Hits        36794    36795    +1     
  Misses       2609     2609           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@benjaminwinger benjaminwinger force-pushed the string-api branch 4 times, most recently from b427c9d to 789c5ad Compare December 7, 2023 20:25
std::ofstream file;
file.open(fileName);
file.open(std::string(fileName));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This api has already been replaced by 'copy to' statement, so you can remove it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COPY ... TO (according to the docs) doesn't have a newline option, so it can't be used as a direct replacement in one of the tests (AlternateDelimCSVTest), though we could just change that test to expect \n as the newline character. Or maybe just remove those tests, or even update and add them to copy_to_csv.test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to merge this as-is. We can discuss removing writeToCSV later.

@@ -226,29 +226,29 @@ struct UDF {
validateType<TR>(returnType);
scalar_exec_func scalarExecFunc = getScalarExecFunc<TR, Args...>(udfFunc, parameterTypes);
definitions.push_back(std::make_unique<function::ScalarFunction>(
name, std::move(parameterTypes), returnType, std::move(scalarExecFunc)));
std::string(name), std::move(parameterTypes), returnType, std::move(scalarExecFunc)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't make sense to use std::string_view in the API when we are going to construct a string regardless. In this case, we should take a std::string by value and let the user decide how to construct it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. Though it brings up the question of do we want to prefer string_view anyway for #2521 (noting again that it's still not working in that context for other reasons), or should we take the more performant route to taking strings by value and moving them when we want ownership? Maybe we should go with the latter for the moment, but eventually move towards the former when and if the rest of what's required for #2521 is implemented?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think we should just distribute a debug version of our windows DLL. Other approaches hurt everyone across the board.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2521 was also about compiling from scratch though, and even the dynamic debug versions are quite large. On the other hand it shouldn't be that hard for someone linking against kuzu to link against different versions depending on the build mode, but it does take more time, work and disk space, as well as slows things down unnecessarily assuming they are debugging their code and not kuzu.

src/include/function/udf_function.h Outdated Show resolved Hide resolved
src/main/connection.cpp Outdated Show resolved Hide resolved
src/main/query_result.cpp Outdated Show resolved Hide resolved
tools/python_api/src_cpp/py_query_result.cpp Show resolved Hide resolved
tools/rust_api/src/database.rs Show resolved Hide resolved
@benjaminwinger benjaminwinger force-pushed the string-api branch 2 times, most recently from 47ec60a to 2fa5217 Compare December 19, 2023 17:01
@benjaminwinger benjaminwinger force-pushed the string-api branch 2 times, most recently from 617a8ff to 5d41bb6 Compare January 8, 2024 19:56
Also replaces some string & with string and std::move where appropriate
@benjaminwinger benjaminwinger merged commit 472c8b9 into master Jan 8, 2024
14 checks passed
@benjaminwinger benjaminwinger deleted the string-api branch January 8, 2024 22:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Windows: C++ ABI stability?
3 participants