SSE version of PresetOutputs::PerPixelMath #62

Merged
revmischa merged 9 commits into projectM-visualizer:master from mbellew:sse on May 23, 2018

Collaborator

mbellew commented May 20, 2018

Tested at 320x240 with the preset "yin - 315 - Ocean of Light (yo im peakin yo Eo.S.-Phat).milk": FPS just about doubled, from 60ish to 110ish.

The google-pprof numbers below don't entirely match up with my FPS observation, but they do show CPU usage going down.

before
  flat  flat%   sum%    cum   cum%  function
   881  48.6%  48.6%    881  48.6%  __iscanonicall
   453  25.0%  73.7%   1225  67.6%  PresetOutputs::PerPixelMath
after
  flat  flat%   sum%    cum   cum%  function
   837  47.0%  47.0%    837  47.0%  __iscanonicall
   131   7.4%  54.4%    965  54.2%  PresetOutputs::PerPixelMath_sse

@revmischa

Collaborator

revmischa commented May 21, 2018

@mbellew

Collaborator Author

mbellew commented May 21, 2018

I have an old iMac mini around somewhere I can blow the dust off and fire up...

@revmischa

Collaborator

revmischa commented May 21, 2018

If it's easier, you can push your changes and then look at the Travis build. It usually takes 5-10 minutes to run through.

labkey-matthewb and others added some commits May 22, 2018

Matthew Bellew
*((void**)((size_t)ret - sizeof(void*))) = allocated;
return ret;
#else
void *mem = aligned_alloc( align, size );

@revmischa

revmischa May 22, 2018

Collaborator

Is aligned_alloc available on EVERY other platform that isn't Apple? Raspberry Pi? BSD? Windows?
I know Mac is special and needs power-of-two alignment, that's fine. I'm concerned about aligned_alloc being portable.

@mbellew

mbellew May 22, 2018

Author Collaborator

I honestly don't know. It is part of the C11 standard, and the only Google results I found about lack of support seem related to OSX. I could remove the #ifdef and always use the hand-written version.
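For readers following along, a minimal sketch of what a hand-written version can look like, in the same spirit as the diff fragment above: over-allocate with malloc, round up to the requested alignment, and stash the original pointer just below the aligned address so it can be freed later. The names portable_aligned_alloc and portable_aligned_free are hypothetical and are not the functions from this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper, not the code from this PR.
// `align` must be a power of two; returns nullptr on bad input or failure.
void *portable_aligned_alloc(std::size_t align, std::size_t size)
{
    if (align == 0 || (align & (align - 1)) != 0)
        return nullptr;
    // Keep the hidden pointer slot itself correctly aligned.
    if (align < alignof(void *))
        align = alignof(void *);
    // Reject sizes that would overflow the padded request.
    if (size > SIZE_MAX - align - sizeof(void *))
        return nullptr;

    // Room for the payload, worst-case padding, and one hidden pointer.
    void *allocated = std::malloc(size + align + sizeof(void *));
    if (!allocated)
        return nullptr;

    // Skip past the hidden slot, then round up to a multiple of `align`.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(allocated) + sizeof(void *);
    std::uintptr_t aligned = (start + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);

    void *ret = reinterpret_cast<void *>(aligned);
    // Remember what malloc returned, directly below the aligned block.
    reinterpret_cast<void **>(ret)[-1] = allocated;
    return ret;
}

// Must be paired with portable_aligned_alloc; plain free() would be wrong here.
void portable_aligned_free(void *p)
{
    if (p)
        std::free(reinterpret_cast<void **>(p)[-1]);
}
```

The cost of this approach is up to align + sizeof(void*) extra bytes per allocation, and callers must never pass the returned pointer to free() directly.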

@revmischa

revmischa May 22, 2018

Collaborator

Well do whatever you think is best

@exp

exp May 22, 2018

macOS seems to provide posix_memalign; would that not be sufficient? It seems to work (almost) exactly the same, except that it takes the address of a pointer to write the result to.
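To illustrate the difference in call shape, here is a small sketch that wraps posix_memalign so it can be called like aligned_alloc(align, size). The wrapper name memalign_wrapper is hypothetical. This assumes a POSIX system; posix_memalign requires the alignment to be a power of two and a multiple of sizeof(void*).

```cpp
#include <cstddef>
#include <stdlib.h>  // posix_memalign, free (POSIX systems only)

// Hypothetical wrapper giving posix_memalign the aligned_alloc call shape.
void *memalign_wrapper(std::size_t align, std::size_t size)
{
    void *mem = nullptr;
    // posix_memalign reports failure through its return value (0 on success)
    // and writes the result through the pointer argument.
    if (posix_memalign(&mem, align, size) != 0)
        return nullptr;
    return mem;  // release with free()
}
```

Memory obtained this way is released with the ordinary free(), so no matching "aligned free" is needed on POSIX systems.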

@mbellew

mbellew May 22, 2018

Author Collaborator

I'd rather use a library routine (posix_memalign and/or aligned_alloc), but I guess the real problem here is detecting which library routine is available in a general way. GCC is not my natural habitat so I'm definitely open to suggestions.

@mbellew

mbellew May 22, 2018

Author Collaborator

There's this from Stack Overflow:

https://stackoverflow.com/questions/16376942/best-cross-platform-method-to-get-aligned-memory

If __STDC_VERSION__ >= 201112L use aligned_alloc.
If _POSIX_VERSION >= 200112L use posix_memalign.
If _MSC_VER is defined, use the Windows stuff.
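Spelled out as preprocessor checks, the three rules above could look roughly like the sketch below. The names pm_aligned_alloc and pm_aligned_free are hypothetical. Two assumptions to flag: the _MSC_VER check is placed first because, as far as I know, the Microsoft C runtime does not provide aligned_alloc, and C++ compilers such as GCC and Clang typically do not define __STDC_VERSION__, so a C++ build on Linux or macOS would normally land in the posix_memalign branch.

```cpp
#include <cstddef>
#if defined(_MSC_VER)
#  include <malloc.h>   // _aligned_malloc, _aligned_free
#else
#  include <stdlib.h>   // aligned_alloc, posix_memalign, free
#  include <unistd.h>   // defines _POSIX_VERSION on POSIX systems
#endif

// Hypothetical names; `align` is assumed to be a nonzero power of two.
inline void *pm_aligned_alloc(std::size_t align, std::size_t size)
{
#if defined(_MSC_VER)
    return _aligned_malloc(size, align);  // note the swapped argument order
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
    // C11 asks for size to be a multiple of align, so round up.
    return aligned_alloc(align, (size + align - 1) / align * align);
#elif defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L
    void *mem = nullptr;
    return posix_memalign(&mem, align, size) == 0 ? mem : nullptr;
#else
#  error "no aligned allocation routine detected"
#endif
}

inline void pm_aligned_free(void *p)
{
#if defined(_MSC_VER)
    _aligned_free(p);  // memory from _aligned_malloc must not go to free()
#else
    free(p);
#endif
}
```

Having a matching free function matters mainly for the Windows branch; on the other two branches it is just free().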

@revmischa

revmischa May 22, 2018

Collaborator

autoconf should be able to detect this and spit out a define in config.h.
Search http://download.redis.io/redis-stable/deps/jemalloc/configure.ac for aligned_alloc. Maybe this is something like what we want?

AC_CHECK_FUNC([memalign],
	      [AC_DEFINE([JEMALLOC_OVERRIDE_MEMALIGN], [ ])
	       public_syms="${public_syms} memalign"])

Again, do whatever you think is best. I just want to avoid breaking portability if it's not a huge pain.

@mbellew

mbellew May 22, 2018

Author Collaborator

Super, I haven't peeked in a configure.ac file before, but this looks like just the thing.

AC_CHECK_FUNCS_ONCE([aligned_alloc posix_memalign])
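On the C++ side, that check makes configure define HAVE_ALIGNED_ALLOC and/or HAVE_POSIX_MEMALIGN (normally via the generated config.h), and the code can branch on those instead of guessing from compiler version macros. A rough sketch follows; the function name cfg_aligned_alloc is hypothetical, and the macro is defined by hand here only so the snippet compiles standalone on a POSIX system.

```cpp
#include <cstddef>
#include <stdlib.h>  // aligned_alloc, posix_memalign, free

// In a real build these macros come from configure (config.h or -D flags).
// Assumption for this standalone sketch: pretend configure found posix_memalign.
#if !defined(HAVE_ALIGNED_ALLOC) && !defined(HAVE_POSIX_MEMALIGN)
#  define HAVE_POSIX_MEMALIGN 1
#endif

// Hypothetical name; `align` is assumed to be a nonzero power of two.
inline void *cfg_aligned_alloc(std::size_t align, std::size_t size)
{
#if defined(HAVE_POSIX_MEMALIGN)
    void *mem = nullptr;
    return posix_memalign(&mem, align, size) == 0 ? mem : nullptr;
#elif defined(HAVE_ALIGNED_ALLOC)
    // Round size up to a multiple of align, as C11 aligned_alloc expects.
    return aligned_alloc(align, (size + align - 1) / align * align);
#endif
}
```

If neither function is found at configure time, a real build would need a third branch, for example the hand-written allocator.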

BTW, everyone, please crank up the mesh size and FPS and give this a test. I'm seeing different results on different machines, from dramatic to not-so-much, and I want to make sure I'm not imagining things.

@revmischa

Collaborator

revmischa commented May 22, 2018

I have one error compiling with Xcode:

[Screenshot of the Xcode error, taken 2018-05-22 at 22:03]

@revmischa revmischa merged commit c8e22e3 into projectM-visualizer:master May 23, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@mbellew mbellew deleted the mbellew:sse branch May 23, 2018

@revmischa revmischa referenced this pull request Sep 2, 2018

Closed

Optimization Projects #51
