New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
decoded view #229
decoded view #229
Conversation
@akrzemi1, would you like to chip in? |
include/boost/url/decoded_view.hpp
Outdated
namespace boost { | ||
namespace urls { | ||
|
||
/** A view on a string as percent decoded characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've improved this (locally for now) but I'll let you let me know when we can resolve this.
include/boost/url/decoded_view.hpp
Outdated
return n_ == 0; | ||
} | ||
|
||
/** Copies characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've improved this (locally for now) but I'll let you let me know when we can resolve this.
667063a
to
e18e113
Compare
An automated preview of the documentation is available at https://229.url.prtest.cppalliance.org/libs/url/doc/html/index.html |
e18e113
to
e8cc3fd
Compare
An automated preview of the documentation is available at https://229.url.prtest.cppalliance.org/libs/url/doc/html/index.html |
namespace detail | ||
{ | ||
template<class T, class = void> | ||
struct has_value_type : std::false_type {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could have done this all in is_mutable_string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do that but I couldn't because of some other implementation details.
detail::has_value_type<T>::value && | ||
detail::has_assign_and_append<T, I>::value | ||
>::type> | ||
: std::is_same<typename T::value_type, char> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh you can do better than this :) do it all in is_mutable_string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything was in is_mutable_string
, but I didn't manage to include the value_type
requirement without splitting it.
template<class T, class I> | ||
using is_mutable_string = __see_below__; | ||
#else | ||
namespace detail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't put detail stuff in headers that docca sees
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the detail
namespace was excluded in Jamfile. In any case, this is ifdef
ed out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the detail/ directory is excluded by the jamfile but there's no way to exclude the namespace from being parsed. docca throws them out, but it still has to lexically analyze it. Better that doxygen never sees it, so that the docs generate faster.
test/unit/snippets.cpp
Outdated
struct MutableString | ||
{ | ||
template< class InputIt > | ||
MutableString& assign( InputIt first, InputIt last ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the requirement that the function return T&
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replaced MutableString&
with void
. But that would imply the requirement is void
. Both are wrong, but MutableString&
is more common although void might be safer (? not sure).
e8cc3fd
to
f3e9885
Compare
An automated preview of the documentation is available at https://229.url.prtest.cppalliance.org/libs/url/doc/html/index.html |
You could also say there is no plain string (there's only a view over pct-encoded storage). Again, I think "[pct_]decoding_view" would be clear? UPDATE W.r.t. naming, I agree that it's never going to read like poetry, and that's probably fine. I suppose |
f3e9885
to
1d1a09f
Compare
An automated preview of the documentation is available at https://229.url.prtest.cppalliance.org/libs/url/doc/html/index.html |
Context (
const_string
)We have been seeing many foundational issues with
const_string
, including the recent issues #159 and #223. When compared tostd::string
orstd::string_view
, its design has been becoming more complex over time (#163, #164) and we are constantly exploring new solutions around it (#112, #127).Some of its problems can be improved by implementing it as a concrete type with the appropriate operators (#127). However, while recently exploring examples of use cases of URIs (#226, #224), we have identified problems that cannot be solved with these extra operators.
For instance, routers almost never need to allocate memory and convert decoded values to
std::string
.const_string
was forcing the application to perform ~2 or 3 allocations in a context where no allocations were really required, not to mention the lack of ergonomics in handling static_pools to avoid these allocations. Althoughconst_string
contains an inline buffer, the user still needs to handle these potential allocations.Another smaller problem is mentioned in the mailing list. A
url_view
has functions such asurl_view::XXXXXX()
andurl_view::encoded_XXXXXX()
for all URI components, where the default functionurl_view::XXXXXX()
is the expensive one. Some people have found this counterintuitive. Other people have considered only using the functions for the encoded components, which could defy the purpose of the library to a large extent.This PR (
decoded_view
)This PR introduces a "transform-view" type for percent decoded strings. Thus, all pairs of functions for URI components (
url_view::XXXXXX()
andurl_view::encoded_XXXXXX()
) would return cheap reference types. The user can use both functions without worrying about unnecessary memory allocations.In cases where memory allocation is required, the allocation is deferred until the user really needs it. The user has total control over how the memory will be allocated. It also avoids an extra allocation when
const_string
would be converted tostd::string
and avoids all allocations when the decoded string is only required for comparisons and checking conditions. When the user needs common conversions, such asstd::string
, helper functions such asto_string()
are provided for new and recycled objects.The view type has the usual string operators, and should be almost as convenient as a
string_view
. The main difference is iterators are bidirectional, sooperator[]
is not available. One would need to advance the iterator or allocate memory. In practice, we don't needoperator[]
for many common use cases, like comparing segments or query values.A summary of the allowed operations:
begin
,cbegin
,end
,cend
,rbegin
,crbegin
,rend
,crend
front
,back
,encoded_data
size
,length
,encoded_size
,encoded_length
,max_size
,empty
copy
,compare
,to_string
,append_to
std::basic_string
oeprator<<(ostream, decoded_view)
About the operations:
string_view
are possible with the same time complexity:starts_with
,ends_with
,contains
,find
,find_*
string_view
are possible with the linear time complexity, which would still be faster than allocating memory:substr
,remove_prefix
,remove_suffix
Tests
Besides testing all member functions, as usual, the tests include use cases people have brought up (
testConversion()
andtestAppend()
), snippets that came up in the examples (mostly intestCompare()
), and use cases analogous to all examples in the Tony Table of PR #127 (testPR127Cases()
).Naming
The initial name for this class is
decoded_view
.The suffix
view
indicates it doesn't own the data. The suffixrange
could denote both views and containers, which makes the suffixrange
useless.The prefix
decoded
is short and, in the context, is enough to convey that the string is percent decoded. So this avoids very long unergonomic names such aspercent_decoded_range
or evenpct_decoded_range
.