Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the benchmark compile successfully by GCC #1

Merged
merged 2 commits into from
May 31, 2023

Conversation

PragmaTwice
Copy link
Contributor

@PragmaTwice PragmaTwice commented Dec 20, 2022

The benchmark CANNOT be compiled successfully by either GCC (12.2) or Clang (14.0).

This PR makes the benchmark pass gcc and run successfully.

BTW, the SSO inline size of std::string in libstdc++ is 15 now (22 in libc++, maybe),
so the default SSO size of SIMDString in the benchmark is not fair to be compared with std::string.

Also, the only use of SIMD in the whole project is a handwritten memcpy: this is strange,
because memcpy in glibc, for example, is highly optimized (handwritten assembly with SSE2/AVX2 instructions),
so I find it hard to believe that such a very trivial toy memcpy could be more efficient.

IMHO this project is more like an assignment from a random college student than a proven product in a company.

SIMDString.h Outdated
@@ -801,7 +801,7 @@ class
return append(c);
}
replace(pos - m_data, 0, 1, c);
return m_data + pos;
return pos;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pos here is an iterator (aka const char*), so it cannot be added to another const char* (m_data).

IMHO it is a logical error (and also a compile error).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been fixed as of 8495257

@vxw77
Copy link

vxw77 commented Dec 23, 2022

They also have a bug in the implementation #3

@MBkkt
Copy link

MBkkt commented Dec 24, 2022

If we talk about memcpy I think their main target is windows libc.
And on windows is common to replace it.
microsoft/mimalloc#201
But yes not sse :)

Also if we talk about glibc It can be worse for very short memcpy because of it's optimized for big sizes
https://research.google/pubs/pub50338/
Here some interesting and relatively new research about it

@linwansong
Copy link
Collaborator

linwansong commented May 31, 2023

BTW, the SSO inline size of std::string in libstdc++ is 15 now (22 in libc++, maybe), so the default SSO size of SIMDString in the benchmark is not fair to be compared with std::string.

The whole point is that we allow for a larger buffer size than std::string. Different string libraries are good for different use cases. As it says in the README.md

SIMDString is designed and benchmarked for the usage patterns found in games and other real-time 3D applications. It may perform less well and even underperform std::string for applications with different usage patterns.

If you want an exact apples to apples comparison, use SIMDString<16>.

Also, the only use of SIMD in the whole project is a handwritten memcpy: this is strange,

Since SSE instructions can only be performed on 16-byte aligned memory, we can only perform SSE memcpy on memory that we can guarantee is 16-byte aligned. SSE memcpy on heap storage is possible, but it requires that heap memory also be 16-byte aligned, which we're not requiring.

because memcpy in glibc, for example, is highly optimized (handwritten assembly with SSE2/AVX2 instructions),
so I find it hard to believe that such a very trivial toy memcpy could be more efficient.

To repeat what @MBkkt said, this isn't true for all versions of libc. memcpy would also run faster in this case because the buffer is 16-byte aligned.

@linwansong linwansong merged commit 8d73587 into RobloxResearch:main May 31, 2023
@luizpara
Copy link

Also, the only use of SIMD in the whole project is a handwritten memcpy: this is strange

@PragmaTwice you are absolutely right. And worse - that handwritten memcpy:

(1) is used only for short strings stored in the stack (not for the actually problematic ones, the ones allocated to the heap);
(2) it boils down to loading 2 char arrays within 2 __m128i (the copied array and the destination array), assigning elements of one array to the other and then assuming the compiler will vectorize.

IMHO this project is more like an assignment from a random college student than a proven product in a company.

This is indeed a very strange repository, full of bugs and based on a naive and trivial usage of SIMD.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants