Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Easier implementation for software running on "normal" architectures #543

Closed
fcorbelli opened this issue Jul 20, 2021 · 10 comments
Closed

Comments

@fcorbelli
Copy link

I wish there was a simplified implementation of xxhash, in a single cpp file, without #ifdef, a ~6KB file instead of ~145KB

Something like this
https://create.stephan-brumme.com/xxhash/
but for XXH3 (the 128 bit version), just to get (snippet)

    XXH3_state_t state128;
    (void)XXH3_128bits_reset(&state128);
	
	size_t readSize;
	while ((readSize = fread(buffer, 1, blockSize, inFile)) > 0) 
	{
		(void)XXH3_128bits_update(&state128, buffer, readSize);
	}

In fact, I find it particularly difficult to debug other programs in which XXH3 is "integrated", it is really difficult to understand the interaction of all the #ifdefs.

Often you have "normal" environments (x86 / x64 machines) in which to use xxhash (actually a great program, when ... it works!)

I'm having trouble with a software that compiles and works fine with gcc on Windows and FreeBSD but not Linux, due to XXH3_state_t

Thank you

@Cyan4973
Copy link
Owner

I wish there was a simplified implementation of xxhash, in a single cpp file, without #ifdef, a ~6KB file

I wish it too.
I like Stefan's variants.

Problem is, it would violate one of the rules of this repository, which is "one xxhash.h file for all", on the ground that it's easier to integrate.

The objective is not all lost though. We could consider smaller specialized files, which are then aggregated into a single xxhash.h file. This way, both approaches (small specialized files + single file for all) would be possible.

However, this is a complex goal, and I don't see that happening on short term.
For your immediate needs, better consider a fix.

I'm having trouble with a software that compiles and works fine with gcc on Windows and FreeBSD but not Linux, due to XXH3_state_t

Could you please describe your issue ?

XXH3 is supposed to work fine on Linux.
It's actually continuously tested on Linux.

@fcorbelli
Copy link
Author

It's a 64-bit alignment issue with the gcc 10.3 compiler (Ubuntu's 21 default) and the -O3 option.

You can find the ongoing thread here
https://encode.su/threads/3664-C-expert-(on-Linux)-needed-with-xxhash

Short version: if into a struct that do a lot of other things you insert a XXH3_state_t (that should be 64 byte aligned) very nasty things happens turning on optimizations.

struct DT 
{
	int64_t date;          	// decimal YYYYMMDDHHMMSS (UT) or 0 if deleted
	int64_t size;          	// size or -1 if unknown
	int64_t attr;          	// first 8 attribute bytes
	int64_t data;          	// sort key or frags written. -1 = do not write
	vector<unsigned> ptr;  	// fragment list
	vector<DTV> dtv;       	// list of versions
	int written;           	// 0..ptr.size() = fragments output. -1=ignore

	string hexhash;			// for new functions, not for add

	char 			franz_block[FRANZOFFSETSHA256];
	uint32_t 		file_crc32;
	XXH3_state_t 	file_xxh3; /////// <<<<====================== align!!!!!
	libzpaq::SHA256 file_sha256;
	libzpaq::SHA1 	file_sha1;
	
	DT(): date(0), size(0), attr(0), data(0),written(-1),file_crc32(0) {memset(franz_block,0,sizeof(franz_block));hexhash="";(void)XXH3_128bits_reset(&file_xxh3);}
};

Even if you try to align the member, the entire struct, or even typedefs of maps (!) to 64-byte.

Disabling XXH alignment does not seems to work either, I'm still working on it.

Running with -fsanitize=undefined show the behavior

Thanks anyway for the answer

PS If you DO NOT "inject" a XXH variables into a struct, but into code, works fine with this compiler.
PS/2 clang does not like at all, too (different alignment error)

@fcorbelli
Copy link
Author

Current workaroud: using this
https://github.com/embeddedartistry/...lloc_aligned.c

=> "manually" allocating

struct DT 
{
    int64_t date;              // decimal YYYYMMDDHHMMSS (UT) or 0 if deleted
    int64_t size;              // size or -1 if unknown
    
(...)
    XXH3_state_t    *pfile_xxh3;  // this is the problem: XXH3's 64-byte align not always work with too old-too new compilers
(...)    
    DT(): date(0), size(0), attr(0), data(0),written(-1),file_crc32(0),file_xxhash64(0) {memset(franz_block,0,sizeof(franz_block));hexhash="";
    
/// big waste of time and space, but now we are sure to maintain 64 byte alignment
    pfile_xxh3=(XXH3_state_t*)aligned_malloc(64, sizeof(XXH3_state_t));
    (void)XXH3_128bits_reset(pfile_xxh3);
        }
};```

@Cyan4973
Copy link
Owner

That looks like a correct workaround.
Note that XXH3_createState() is essentially doing the same thing, guaranteeing that the allocated space for the state is correctly aligned.

Indeed, aligning a member of a struct to 64-bytes has consequences for the encompassing structure.
The rules regarding structure alignment are pretty clear, so it should be a straightforward application.
However, given that aligning on 64-bytes is not common, this could be a case which is not tested, hence could be missed.
Given that the issue happens on a specific compiler version, on a specific OS, using a specific set of flags,
a compiler bug issue is not excluded at this stage.

@felixhandte
Copy link
Contributor

felixhandte commented Jul 21, 2021

C11 says:

If an object's alignment is made stricter (larger) than max_align_t using _Alignas, it has extended alignment requirement. A struct or union type whose member has extended alignment is an over-aligned type. It is implementation-defined if over-aligned types are supported, and their support may be different in each kind of storage duration.

Which, if I'm reading correctly, is not particularly encouraging. It looks like implementations are free to ignore alignas(64)...

I'm not sure if there's a good solution available, but one small possible improvement would be: is it possible the XXH_ALIGN() macro does not interact well with C++? It only uses alignas() when it detects C11, but doesn't check for C++11. We should consider adding || (defined(__cplusplus) && (__cplusplus >= 201103L).

@Cyan4973
Copy link
Owner

Cyan4973 commented Jul 21, 2021

Yes that's a good suggestion.
I did not realized that the issue would be triggered when compiling the source code in C++ mode.

@t-mat
Copy link
Contributor

t-mat commented Jul 22, 2021

As for C++, please use operator new and operator delete.

The following code is minimal repro and solution of this issue.

#define XXH_STATIC_LINKING_ONLY // access advanced declarations
#define XXH_IMPLEMENTATION      // access definitions
#include "../xxhash.h"
#include <stdio.h>              // printf()

static const size_t alignment = 64;

struct Issue543 {
    char placeHolder[1];
    XXH3_state_t xxh3;

    void test() {
        const uintptr_t pThis = reinterpret_cast<uintptr_t>(this);
        const uintptr_t pXxh3 = reinterpret_cast<uintptr_t>(&this->xxh3);
        printf("struct Issue543: ptr = %p", this);
        printf(", this %% %zd = %2zd",       alignment, pThis % alignment);
        printf(", &this->xxh %% %zd = %2zd", alignment, pXxh3 % alignment);
        printf("\n");
    }
};

struct Fix543 {
    char placeHolder[1];
    XXH3_state_t xxh3;

    static void* operator new(std::size_t sz) {
        void* p = aligned_alloc(alignment, sz);  // You can use std::aligned_alloc() with C++17.
        printf("struct Fix543: custom new for size = %zd, ptr=%p\n", sz, p);
        return p;
    }

    void operator delete(void* ptr) noexcept {
        free(ptr);
    }
    
    void test() {
        const uintptr_t pThis = reinterpret_cast<uintptr_t>(this);
        const uintptr_t pXxh3 = reinterpret_cast<uintptr_t>(&this->xxh3);
        printf("struct Fix543: ptr = %p", this);
        printf(", this %% %zd = %2zd",       alignment, pThis % alignment);
        printf(", &this->xxh %% %zd = %2zd", alignment, pXxh3 % alignment);
        printf("\n");
    }
};

int main(int argc, char* argv[]) {
    for(int i = 0; i < 8; ++i) {
        auto* p = new Issue543();
        p->test();
        // note : we do not delete p for test.
    }

    for(int i = 0; i < 8; ++i) {
        auto* p = new Fix543();
        p->test();
        // note : we do not delete p for test.
    }
}

/*
$ g++ --version
g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

$ g++ -O3 issue_543.cpp && ./a.out
struct Issue543: ptr = 0x55acd6e45eb0, this % 64 = 48, &this->xxh % 64 = 48
struct Issue543: ptr = 0x55acd6e46550, this % 64 = 16, &this->xxh % 64 = 16
struct Issue543: ptr = 0x55acd6e467e0, this % 64 = 32, &this->xxh % 64 = 32
struct Issue543: ptr = 0x55acd6e46a70, this % 64 = 48, &this->xxh % 64 = 48
struct Issue543: ptr = 0x55acd6e46d00, this % 64 =  0, &this->xxh % 64 =  0
struct Issue543: ptr = 0x55acd6e46f90, this % 64 = 16, &this->xxh % 64 = 16
struct Issue543: ptr = 0x55acd6e47220, this % 64 = 32, &this->xxh % 64 = 32
struct Issue543: ptr = 0x55acd6e474b0, this % 64 = 48, &this->xxh % 64 = 48
struct Fix543: custom new for size = 640, ptr=0x55acd6e47740
struct Fix543: ptr = 0x55acd6e47740, this % 64 =  0, &this->xxh % 64 =  0
struct Fix543: custom new for size = 640, ptr=0x55acd6e47a40
struct Fix543: ptr = 0x55acd6e47a40, this % 64 =  0, &this->xxh % 64 =  0
struct Fix543: custom new for size = 640, ptr=0x55acd6e47d40
struct Fix543: ptr = 0x55acd6e47d40, this % 64 =  0, &this->xxh % 64 =  0
struct Fix543: custom new for size = 640, ptr=0x55acd6e48040
struct Fix543: ptr = 0x55acd6e48040, this % 64 =  0, &this->xxh % 64 =  0
struct Fix543: custom new for size = 640, ptr=0x55acd6e48340
struct Fix543: ptr = 0x55acd6e48340, this % 64 =  0, &this->xxh % 64 =  0
struct Fix543: custom new for size = 640, ptr=0x55acd6e48640
struct Fix543: ptr = 0x55acd6e48640, this % 64 =  0, &this->xxh % 64 =  0
struct Fix543: custom new for size = 640, ptr=0x55acd6e48940
struct Fix543: ptr = 0x55acd6e48940, this % 64 =  0, &this->xxh % 64 =  0
struct Fix543: custom new for size = 640, ptr=0x55acd6e48c40
struct Fix543: ptr = 0x55acd6e48c40, this % 64 =  0, &this->xxh % 64 =  0
 */

Also, I'd like to note that it seems recent version of libstdc++ already fixed this issue. Please add -std=c++17 to gcc/clang command line switch.

g++ -std=c++17 -O3 issue_543.cpp && ./a.out

Default allocator (struct Issue543) returns nicely aligned memory pointer 🎉

$ g++ -std=c++17 -O3 issue_543.cpp && ./a.out
struct Issue543: ptr = 0x56387bb03f00, this % 64 =  0, &this->xxh % 64 =  0
struct Issue543: ptr = 0x56387bb045c0, this % 64 =  0, &this->xxh % 64 =  0
struct Issue543: ptr = 0x56387bb048c0, this % 64 =  0, &this->xxh % 64 =  0
struct Issue543: ptr = 0x56387bb04bc0, this % 64 =  0, &this->xxh % 64 =  0
struct Issue543: ptr = 0x56387bb04ec0, this % 64 =  0, &this->xxh % 64 =  0
struct Issue543: ptr = 0x56387bb051c0, this % 64 =  0, &this->xxh % 64 =  0
struct Issue543: ptr = 0x56387bb054c0, this % 64 =  0, &this->xxh % 64 =  0
struct Issue543: ptr = 0x56387bb057c0, this % 64 =  0, &this->xxh % 64 =  0

@gzm55
Copy link
Contributor

gzm55 commented Jul 22, 2021

Yes that's a good suggestion.
I did not realized that the issue could be triggered when compiling the source code in C++ mode.

align macro:

/* align macros */
#if defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L)   /* C11+ */
#  include <stdalign.h>
#  define NMH_ALIGN(n)      alignas(n)
#elif defined(__GNUC__)
#  define NMH_ALIGN(n)      __attribute__ ((aligned(n)))
#elif defined(_MSC_VER)
#  define NMH_ALIGN(n)      __declspec(align(n))
#else
#  define NMH_ALIGN(n)   /* disabled */
#endif

@fcorbelli
Copy link
Author

@ new and delete
https://encode.su/threads/3664-C-expert-(on-Linux)-needed-with-xxhash?p=70372&viewfull=1#post70372
a matter of habit

aligned_alloc (or whatever) does not always exists (in fact, I used this as a workaround/fix https://github.com/embeddedartistry/embedded-resources/blob/master/examples/c/malloc_aligned.c
neither -std=c++17 is feasible (happen to compile my project for non-Intel NAS and vSphere servers and others "strange" things)

So I would like to ask a question about alignment as a mandatory prerequisite for XXH3.

Is it possible to eliminate at all (optionally), of course "paying" a certain performance penalty?

I only did a few quick (failed) tests as I ran into a concomitant g++ (!) bug that took me a long time

Thanks for all reply

PS The state is big, very big (hundreds of bytes) so, in fact, a "on demand" allocation it's an improvement for my software (sometimes it allocates 500,000 or even a few million structures), not all evils come to harm

@t-mat
Copy link
Contributor

t-mat commented Jul 22, 2021

If you can use accessor method and sacrifice 64 bytes, it allows you to introduce manual alignment.

struct Fix543_2 {
    static const size_t align = 64;

    char placeHolder[1];
    char xxh3_buf[sizeof(XXH3_state_t) + align];

    static uintptr_t computeXxh3AlignedAddr(uintptr_t base) {
        // Adopted from XXH_alignedMalloc() as an example.
        size_t offset = align - ((size_t)base & (align - 1));
        assert(offset <= align);
        return base + offset;
    }

    XXH3_state_t* xxh3State() {
        const uintptr_t p0 = reinterpret_cast<uintptr_t>(&this->xxh3_buf[0]);
        const uintptr_t p1 = computeXxh3AlignedAddr(p0);
        return reinterpret_cast<XXH3_state_t*>(p1);
    }

//  const XXH3_state_t* xxh3State() const { ... }

    void test() {
        const uintptr_t pThis = reinterpret_cast<uintptr_t>(this);
        const uintptr_t pXxh3 = reinterpret_cast<uintptr_t>(xxh3State());
        printf("struct Fix543_2: ptr = %p", this);
        printf(", alignof(*this) = %2zd",    alignof(*this));
        printf(", this %% %zd = %2zd",       alignment, pThis % alignment);
        printf(", &this->xxh %% %zd = %2zd", alignment, pXxh3 % alignment);
        printf("\n");
    }
};
/*
struct Fix543_2: ptr = 0x55bf02971f40, alignof(*this) =  1, this % 64 =  0, &this->xxh % 64 =  0
struct Fix543_2: ptr = 0x55bf029721d0, alignof(*this) =  1, this % 64 = 16, &this->xxh % 64 =  0
struct Fix543_2: ptr = 0x55bf02972460, alignof(*this) =  1, this % 64 = 32, &this->xxh % 64 =  0
struct Fix543_2: ptr = 0x55bf029726f0, alignof(*this) =  1, this % 64 = 48, &this->xxh % 64 =  0
struct Fix543_2: ptr = 0x55bf02972980, alignof(*this) =  1, this % 64 =  0, &this->xxh % 64 =  0
struct Fix543_2: ptr = 0x55bf02972c10, alignof(*this) =  1, this % 64 = 16, &this->xxh % 64 =  0
struct Fix543_2: ptr = 0x55bf02972ea0, alignof(*this) =  1, this % 64 = 32, &this->xxh % 64 =  0
struct Fix543_2: ptr = 0x55bf02973130, alignof(*this) =  1, this % 64 = 48, &this->xxh % 64 =  0
*/

Is it possible to eliminate at all (optionally), of course "paying" a certain performance penalty?

With the following conditions, I think it's doable.

  • Define XXH_STATIC_LINKING_ONLY and XXH_VECTOR as XXH_SCALAR before including xxhash.h,

For example, add the following #if and #endif pair to xxhash.h.

// xxhash.h
+   #if ! defined(XXH_ALIGN)
    #if defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L)   /* C11+ */
    #  include <stdalign.h>
    #  define XXH_ALIGN(n)      alignas(n)
    ...
    #  define XXH_ALIGN(n)   /* disabled */
    #endif
+   #endif

Compile and run the following code with and without XXH_VECTOR and XXH_ALIGN.

#define XXH_STATIC_LINKING_ONLY // access advanced declarations
#define XXH_IMPLEMENTATION      // access definitions
#include "../xxhash.h"          // modified version of xxhash.h
#include <stdio.h>              // printf()

struct MyStruct {
    char placeHolder[1];
    XXH3_state_t xxh3;
    XXH3_state_t* xxh3State() { return &xxh3; }
};

int main(int argc, char* argv[]) {
    auto* p = new MyStruct();
    XXH3_state_t* state = p->xxh3State();

    uint32_t seed = 0;

    char buffer[] = "Test";
    size_t bufferSize = sizeof(buffer);

    XXH3_128bits_reset_withSeed(state, (XXH64_hash_t)seed);
    XXH3_128bits_update(state, buffer, bufferSize);
    uint32_t digest = (XXH3_128bits_digest(state).low64);

    printf("XXH_VECTOR=%d\n", XXH_VECTOR);
    printf("digest=0x%08x\n", digest);
    printf("alignof(XXH3_state_t) = %zd\n", alignof(XXH3_state_t));

    delete p;
}
$ g++ -O3 xxh3_alignment.cpp && ./a.out
XXH_VECTOR=1
digest=0x93a401e0
alignof(XXH3_state_t) = 64

$ g++ -O3 xxh3_alignment.cpp -D XXH_VECTOR=XXH_SCALAR -D'XXH_ALIGN(x)=' && ./a.out
XXH_VECTOR=0
digest=0x93a401e0
alignof(XXH3_state_t) = 8

felixhandte added a commit to felixhandte/xxHash that referenced this issue Jul 22, 2021
The previous macro test only detected C11 and failed in modern C++, which
actually goes one step further and makes `alignas` a keyword. It's not clear
that this actually improves the situation with respect to Cyan4973#543, but it should
be slightly more correct in some sense.
@Cyan4973 Cyan4973 closed this as completed Aug 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants