Make simplecpp friendlier toward 64-bit builds.#288
Make simplecpp friendlier toward 64-bit builds.#288zlaski wants to merge 1 commit intocppcheck-opensource:masterfrom
Conversation
|
some CI warning |
| if (!tok || tok->str().size() != 1U) | ||
| return false; | ||
| const char c = std::tolower(tok->str()[0]); | ||
| const char c = (char)std::tolower(tok->str()[0]); |
There was a problem hiding this comment.
Hi, shouldn't this also be static_cast<char>( std::tolower( static_cast<unsigned char>( tok->str()[0] ) ) ); ?
If I remember correctly, it should be like that with this kind of function since tok->string() returns a std::string so the type of char[0] should be char
There was a problem hiding this comment.
I don't have a strong opinion but static_cast is pretty verbose in this case.
If we know that the first char ascii value is 0-127 then the cast is not needed. well I guess for garbage input the value could be out of range so the cast makes sense.
There was a problem hiding this comment.
I believe I added the casts to please the compiler. Like you said, your solution is verbose, and I'm not sure what it buys us (perhaps I just don't understand the intricacies of static_cast<...>). Also, since std::string is made up of elements of type char, it is intended to handle only ASCII values 0-127. If you want to handle values 0-255, I think it would be better to do something like
typedef std::basic_string<unsigned char> TokenString;
and then to change char to unsigned char throughout.
I guess a broader question is whether modern preprocessors should also handle UTF8/Unicode. I have absolutely no opinion on this.
There was a problem hiding this comment.
Also, since std::string is made up of elements of type char, it is intended to handle only ASCII values 0-127.
If utf-8 is used then on my machine since char is signed there will sometimes be negative values.
I don't know if there is anything unintended about that. I.e. as far as I know there is not undefined behavior. But maybe something in the conversion is undefined.
typedef std::basic_string TokenString;
ouch. I believe that would complicate the interface. For instance, Cppcheck reads the tokens from simplecpp and expects that TokenString can be converted directly to plain simple std::string.
There was a problem hiding this comment.
I see, yeah when this is looked at from a broader perspective the first cast to unsigned char does seem unnecessary, but it would not harm. The second cast (casting return value of tolower to char) would appease the MSVC compiler though and perhaps it could be added.
| } | ||
|
|
||
| void simplecpp::TokenList::lineDirective(unsigned int fileIndex, unsigned int line, Location *location) | ||
| void simplecpp::TokenList::lineDirective(std::size_t fileIndex, unsigned int line, Location *location) |
There was a problem hiding this comment.
it's rather excessive to use a 64-bit value for fileIndex. you might see 10-20 files in a translation unit but hardly millions of files..
There was a problem hiding this comment.
Yes, you're right. For some reason, I thought that fileIndex was a file offset.
| # else | ||
| # define SIMPLECPP_LIB | ||
| # endif | ||
| # pragma warning(disable: 4706) |
There was a problem hiding this comment.
Perhaps this could be put in its own block?
Similar to how nullptr is defined as NULL in this file.
|
I am not a fan about using std::size_t for various variables. Maybe we can shut off the MSVC warnings globally instead. |
This should have been now completely resolved. Thanks |
Use std::size_t to make build bit-width agnostic. On Windows, suppress warning about assignment in a conditional expression.