Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor formatting strings #13212

Merged
merged 33 commits into from
Dec 4, 2020
Merged

Conversation

IntelOrca
Copy link
Contributor

Eradicate all coded format tokens. All internal strings now contain raw tokens such as {STRINGID} and {RED}. New iterators have been created to iterate the tokens and the UTF-8 codepoints.

Formatting strings has been re-written and a new template version is available.

@IntelOrca IntelOrca force-pushed the new-format-string branch 4 times, most recently from 6eeceda to 88281c4 Compare October 19, 2020 21:59
@IntelOrca IntelOrca force-pushed the new-format-string branch 2 times, most recently from 55ab7d5 to 85bde11 Compare October 19, 2020 23:33
@tupaschoal
Copy link
Member

Do we want this in before the next version is released Ted?

@duncanspumpkin
Copy link
Contributor

@tupaschoal this is bound to produce some bugs, I would say no.

@IntelOrca
Copy link
Contributor Author

@tupaschoal there is no hurry for this

@tupaschoal tupaschoal added this to the After v0.3.2 milestone Oct 22, 2020
@tupaschoal
Copy link
Member

Yep, I though so, just wanted to confirm

@IntelOrca IntelOrca marked this pull request as ready for review October 22, 2020 22:03
@IntelOrca
Copy link
Contributor Author

I will apply new naming / casing suggestions.

@IntelOrca IntelOrca force-pushed the new-format-string branch 3 times, most recently from 7747a44 to cac89e2 Compare November 1, 2020 22:33
Copy link
Member

@tupaschoal tupaschoal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave up on 3def53a because I was dozing

src/openrct2/core/String.hpp Outdated Show resolved Hide resolved
Comment on lines 82 to 85
auto result = std::find_if(std::begin(format_code_tokens), std::end(format_code_tokens), [token](auto& fct) {
return String::Equals(token, fct.token, true);
});
return result != std::end(format_code_tokens) ? result->code : 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use cbegin and cend?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does cbegin and cend do?

uint32_t format_get_code(const char* token);
#include <string_view>

uint32_t format_get_code(std::string_view token);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const &? I don't think string views need them, but because you used above, I'm vouching for consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ZehMatt said to make all string_view not const &.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, it's just that there are other places in the diff doing const& (not sure if in the latest revision though)

src/openrct2/localisation/Formatting.cpp Outdated Show resolved Hide resolved
size_t num;
if (value < 0)
{
// TODO handle edge case: std::numeric_limits<int64_t>::min();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle it?

Comment on lines +109 to +224
auto actual = FormatString("Train is going at {VELOCITY}.", 1024);
ASSERT_EQ("Train is going at 1,648 km/h.", actual);
}

TEST_F(FormattingTests, velocity_mps)
{
gConfigGeneral.measurement_format = MeasurementFormat::SI;
auto actual = FormatString("Train is going at {VELOCITY}.", 1024);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were probably not passing (and got fixed later in the PR?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you are saying. Does this test fail for you?

Comment on lines +29 to +32
std::string_view _str;
std::string _strOwned;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use use TitleCase for member variables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is lower camel case, the convention for private fields.

if (value < 0)
{
ss << '-';
value = -value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::abs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we already know that it is negative, abs is unnecessary.

Comment on lines 35 to 36
std::string_view _str;
size_t _index;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TitleCase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't title case.

class CodepointView
{
private:
std::string_view _str;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use use TitleCase for member variables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't title case.

@tupaschoal tupaschoal removed this from the After v0.3.2 milestone Nov 4, 2020
@ZehMatt
Copy link
Member

ZehMatt commented Nov 5, 2020

I don't quite understand all the thread_local use, when std::string is a local variable it should be thread safe. Maybe I'm missing something about the implementation?

src/openrct2-ui/windows/TitleEditor.cpp Outdated Show resolved Hide resolved
src/openrct2-ui/windows/TitleEditor.cpp Outdated Show resolved Hide resolved
src/openrct2/core/String.cpp Outdated Show resolved Hide resolved
@@ -154,6 +154,32 @@ namespace String
}
}

bool Equals(const std::string_view& a, const std::string_view& b, bool ignoreCase)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool Equals(const std::string_view& a, const std::string_view& b, bool ignoreCase)
constexpr bool Equals(const std::string_view a, const std::string_view b, bool ignoreCase)
{
using CharType = std::string_view::value_type;
auto cmp = ignoreCase ? [](CharType a, CharType b) {
return std::tolower(a) == std::tolower(b);
} : [](CharType a, CharType b) {
return a == b;
};
return std::equal(a.begin(), a.end(), b.begin(), cmp);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not compile

src/openrct2/drawing/TTF.cpp Outdated Show resolved Hide resolved
src/openrct2/drawing/TTF.cpp Outdated Show resolved Hide resolved
src/openrct2/interface/StdInOutConsole.cpp Outdated Show resolved Hide resolved
@IntelOrca
Copy link
Contributor Author

IntelOrca commented Nov 5, 2020

@ZehMatt Formatting calls are called from drawing calls. Drawing calls can be multi-threaded, so I am using threadlocal to ensure it doesn't share the same buffer on the heap. I am avoiding fixed static buffers because we are trying to move away from fixed numbers which may be exceeded in some scenarios or otherwise just hog stack space.

That is my reason for making it thread local over static. The reason I don't make it a local variable is to avoid memory allocation per draw frame.

Over time I would like to clean up all these functions and make them use formatter and other routines more directly which would then use a more shared thread local buffer.

@ZehMatt
Copy link
Member

ZehMatt commented Nov 5, 2020

None of the buffers should be shared, the implementation should be thread-safe by design and not by choice. I understand the argument about allocations but the stringstream isn't helping here. The current thing with the drawing right now is that it shares a global variable as buffer which ofc requires it to be either thread_local or locking, but the goal of the Formatter was to eradicate this kind of usage. Also often when the strings are small you don't have to worry about allocations due to small string optimization.

@IntelOrca
Copy link
Contributor Author

@ZehMatt how do you suggest it should be done then if we want to avoid memory allocation every single frame and avoid fixed length stack buffers?

@ZehMatt
Copy link
Member

ZehMatt commented Nov 6, 2020

In one of my projects I do something like this

        template<typename... Args> void formatMsg(MsgType type, const char* fmt, Args&&... args)
        {
            char buffer[128];
            // Use local buffer first, for 90% of the time this is enough space.
            int actualLength = snprintf(buffer, sizeof(buffer), fmt, std::forward<Args&&>(args)...);
            if (actualLength >= sizeof(buffer))
            {
                // snprintf_s returns the size of the completely formatted string, if bigger than buffer
                // it was truncated allocate then temporary buffer.
                std::unique_ptr<char[]> tempBuf(new char[actualLength + 1]);
                _snprintf_s(tempBuf.get(), actualLength + 1, actualLength, fmt, std::forward<Args&&>(args)...);
                tempBuf[actualLength] = '\0';
                return printMsg(type, tempBuf.get());
            }
            return printMsg(type, buffer);
        }

which covers 90% of all the things going in. Also aren't the current global buffers fixed size? We can probably measure what the maximum/minimum/average is and just allocate for the worst case like above.

@IntelOrca
Copy link
Contributor Author

IntelOrca commented Nov 6, 2020

My new global buffer is a stringstream or string which is cleared and re-used each frame, so it can grow to any size. I would like to avoid a fixed length, I would like 100% of cases to be correct, not just 90%.

If you allocate a temp buffer for this one really long string, you are still going to get that every tick that long string might be formatted.

@ZehMatt
Copy link
Member

ZehMatt commented Nov 6, 2020

My new global buffer is a stringstream or string which is cleared and re-used each frame, so it can grow to any size. I would like to avoid a fixed length, I would like 100% of cases to be correct, not just 90%.

If you allocate a temp buffer for this one really long string, you are still going to get that every tick that long string might be formatted.

I don't know what to say, having dynamic strings comes with a cost, thread_local also comes with a cost, its probably not as expensive as an allocation but its not free either. I'm not sure why you want dynamic strings, the localisation is a fixed data set, the only exceptions for this is user supplied strings as arguments which is not often really.

Copy link
Member

@Gymnasiast Gymnasiast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, all my code-related questions/remarks have been solved or answered.

I still want to test this, though.

Copy link
Member

@Gymnasiast Gymnasiast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Banners from RCT1 are completely broken:

afbeelding

The colours are not imported correctly, there are large blank gaps and one sign even says "palelavender".

Edit: so are banners from RCT2 parks. E.g. the no entry banner on Judge Roy Scream from SF over Texas. Hovering banners also does not show the colours correctly, even on signs that otherwise work.

Copy link
Member

@Gymnasiast Gymnasiast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the banner bugs are gone, but the tooltip is still not quite correct:

afbeelding

It should be in the colour of the banner text. In above case, it should be pale lavender.

Additionally, the banner on the left should display "Exit", but displays "Exitnce" (I made sure to check in RCT1).

afbeelding

Both issues are in Blackpool Pleasure Beach.

afbeelding

And the name of Judge Roy Scream (SF over Texas) also contains garbage.

Looks like a buffer is not cleared correctly.

@IntelOrca
Copy link
Contributor Author

Issues were due to my last fix not being particularly good enough. I have now resolved those issues.

Copy link
Member

@Gymnasiast Gymnasiast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropdown lists lack the padding on the left, and are missing the marker that denotes the selection:

afbeelding

Copy link
Member

@Gymnasiast Gymnasiast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above error appears to be the result of an error in nl-NL.txt that did not surface before (it worked by coincidence). It should be fixed in Localisation.

Saving and loading appears to work properly.

@IntelOrca IntelOrca merged commit d58d834 into OpenRCT2:develop Dec 4, 2020
@tupaschoal tupaschoal added this to the v0.3.3 milestone Dec 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants