200+ FPS (on a simple laptop) #86

Open
ker2x opened this issue Dec 7, 2022 · 12 comments

@ker2x
Contributor

ker2x commented Dec 7, 2022

I'm still busy refactoring, and I'm currently relying on Intel's oneAPI and TBB for multithreading.
But you can check the code here https://github.com/ker2x/particle-life/tree/oneapi-dpl/particle_life/src , and perhaps backport the modifications to a standard compiler and standard libraries (or I'll do it myself some day, I guess).

It's not fully optimized yet, but here are the notable changes:

  • Using a vertex buffer object (VBO) instead of brute-forcing one circle draw call per particle.
	// Take the group by const reference so the position vector is not copied on every draw.
	void Draw(const colorGroup& group)
	{
		ofSetColor(group.color);
		vbo.setVertexData(group.pos.data(), group.pos.size(), GL_DYNAMIC_DRAW);
		vbo.draw(GL_POINTS, 0, group.pos.size());
	}
  • Using SOA (struct of arrays) instead of AOS (array of structs). It allows better vectorization, and it was needed to use the VBO efficiently anyway.
struct colorGroup {
	std::vector<ofVec2f> pos;
	std::vector<float> vx;
	std::vector<float> vy;
	ofColor color;
};
  • It should also make it easier to add more colors (I hope).

  • Major cleanup of the interaction code.

void ofApp::interaction(colorGroup& Group1, const colorGroup& Group2, 
		const float G, const float radius, bool boundsToggle) const
{
	
	assert(Group1.pos.size() % 64 == 0);
	assert(Group2.pos.size() % 64 == 0);
	
	const float g = G / -100;	// attraction coefficient

//		oneapi::tbb::parallel_for(
//			oneapi::tbb::blocked_range<size_t>(0, group1size), 
//			[&Group1, &Group2, group1size, group2size, radius, g, this]
//			(const oneapi::tbb::blocked_range<size_t>& r) {

	for (size_t i = 0; i < Group1.pos.size(); i++)
	{
		float fx = 0;	// force on x
		float fy = 0;	// force on y
		
		for (size_t j = 0; j < Group2.pos.size(); j++)
		{
			const float distance = Group1.pos[i].distance(Group2.pos[j]);
			if ((distance < radius)) {
				const float force = 1 / std::max(std::numeric_limits<float>::epsilon(), distance);	// avoid dividing by zero
				fx += ((Group1.pos[i].x - Group2.pos[j].x) * force);
				fy += ((Group1.pos[i].y - Group2.pos[j].y) * force);
			}
		}

		// Wall Repel
		if (wallRepel > 0.0F)
		{
			if (Group1.pos[i].x < wallRepel) Group1.vx[i] += (wallRepel - Group1.pos[i].x) * 0.1;
			if (Group1.pos[i].x > boundWidth - wallRepel) Group1.vx[i] += (boundWidth - wallRepel - Group1.pos[i].x) * 0.1;
			if (Group1.pos[i].y < wallRepel) Group1.vy[i] += (wallRepel - Group1.pos[i].y) * 0.1;
			if (Group1.pos[i].y > boundHeight - wallRepel) Group1.vy[i] += (boundHeight - wallRepel - Group1.pos[i].y) * 0.1;
		}

		// Viscosity & gravity
		Group1.vx[i] = (Group1.vx[i] + (fx * g)) * (1.0 - viscosity);
		Group1.vy[i] = (Group1.vy[i] + (fy * g)) * (1.0 - viscosity) + worldGravity;
//		Group1.vx[i] = std::fmaf(Group1.vx[i], (1.0F - viscosity), std::fmaf(fx, g, 0.0F));
//		Group1.vy[i] = std::fmaf(Group1.vy[i], (1.0F - viscosity), std::fmaf(fy, g, worldGravity));

		//Update position
		Group1.pos[i].x += Group1.vx[i];
		Group1.pos[i].y += Group1.vy[i];
	}

	if (boundsToggle) {
		for (auto& p : Group1.pos)
		{
			p.x = std::min(std::max(p.x, 0.0F), static_cast<float>(boundWidth));
			p.y = std::min(std::max(p.y, 0.0F), static_cast<float>(boundHeight));
		}
	}	
}

I still have some crap to clean up :)

  • Using oneapi::tbb::parallel_invoke for parallelization.
	oneapi::tbb::parallel_invoke(
		[&] { interaction(red,   red,   powerSliderRR, vSliderRR, boundsToggle); },
		[&] { interaction(red,   green, powerSliderRG, vSliderRG, boundsToggle); },
		[&] { interaction(red,   blue,  powerSliderRB, vSliderRB, boundsToggle); },
		[&] { interaction(red,   white, powerSliderRW, vSliderRW, boundsToggle); },
		[&] { interaction(green, red,   powerSliderGR, vSliderGR, boundsToggle); },
		[&] { interaction(green, green, powerSliderGG, vSliderGG, boundsToggle); },
		[&] { interaction(green, blue,  powerSliderGB, vSliderGB, boundsToggle); },
		[&] { interaction(green, white, powerSliderGW, vSliderGW, boundsToggle); },
		[&] { interaction(blue,  red,   powerSliderBR, vSliderBR, boundsToggle); },
		[&] { interaction(blue,  green, powerSliderBG, vSliderBG, boundsToggle); },
		[&] { interaction(blue,  blue,  powerSliderBB, vSliderBB, boundsToggle); },
		[&] { interaction(blue,  white, powerSliderBW, vSliderBW, boundsToggle); },
		[&] { interaction(white, red,   powerSliderWR, vSliderWR, boundsToggle); },
		[&] { interaction(white, green, powerSliderWG, vSliderWG, boundsToggle); },
		[&] { interaction(white, blue,  powerSliderWB, vSliderWB, boundsToggle); },
		[&] { interaction(white, white, powerSliderWW, vSliderWW, boundsToggle); }
	);

This is me slowly learning to use oneAPI and SYCL in order to offload all the parallel code to the GPU in the future (in a new project).

The biggest performance improvement comes from the use of SOA and VBO.
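
To make the SOA point a bit more concrete, here is a minimal sketch of copying an array-of-structs particle list into a struct-of-arrays layout. The names `Vec2`, `Particle`, `ColorGroupSoA` and `toSoA` are hypothetical and not taken from the branch; `Vec2` stands in for `ofVec2f` so the snippet compiles without openFrameworks. The idea is that `pos.data()` then points at one contiguous run of x,y pairs, which is the kind of buffer `vbo.setVertexData` receives in the `Draw` function above.

```cpp
#include <cassert>
#include <vector>

// Stand-in for ofVec2f: two packed floats (assumption for this sketch).
struct Vec2 {
	float x;
	float y;
};
static_assert(sizeof(Vec2) == 2 * sizeof(float), "positions must be tightly packed");

// Array-of-structs: position and velocity interleaved per particle.
struct Particle {
	float x, y;
	float vx, vy;
};

// Struct-of-arrays: one contiguous array per attribute.
struct ColorGroupSoA {
	std::vector<Vec2> pos;
	std::vector<float> vx;
	std::vector<float> vy;
};

// Copy an AoS particle list into the SoA layout.
ColorGroupSoA toSoA(const std::vector<Particle>& particles)
{
	ColorGroupSoA group;
	group.pos.reserve(particles.size());
	group.vx.reserve(particles.size());
	group.vy.reserve(particles.size());
	for (const Particle& p : particles) {
		group.pos.push_back(Vec2{p.x, p.y});
		group.vx.push_back(p.vx);
		group.vy.push_back(p.vy);
	}
	// All attribute arrays must stay the same length.
	assert(group.pos.size() == group.vx.size() && group.vx.size() == group.vy.size());
	return group;
}
```

With the positions stored this way, a whole color group can be uploaded in a single call instead of being walked particle by particle.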

@hunar4321
Owner

Great job looking forward to it 👍 👍 💯

@KhadrasWellun

It is not working for me

@ker2x
Contributor Author

ker2x commented Dec 8, 2022

> It is not working for me

Yes. I'll try to make something mergeable with the main project, and independent of oneAPI.

@ker2x
Contributor Author

ker2x commented Dec 19, 2022

It took me a while. The code is unfortunately much slower with MSVC than with the Intel compiler, but it is still faster than the previous version, of course.

I also removed the dependency on Intel TBB, so no 200 FPS (the TBB calls can still be seen in commented-out code, however).

@KhadrasWellun

Hi! I want to set different particle sizes depending on their type. For example, red should be 1.0 pixels, green 1.2 pixels, blue 1.4 pixels, and so on. I saw in the code that a size is defined for all particles in ofApp.h. How could I condition this particle size with an "if"?
I'm referring to this piece of code:
void draw() const
{
	ofSetColor(r, g, b, 100); //set particle color + some alpha
	ofDrawCircle(x, y, 1.5F); //draws a point at x,y coordinates, the size of a 1.5 pixel circle
}
My colors are defined by generic names:
void ofApp::restart()
{
	if (numberSliderα > 0) { alpha = CreatePoints(numberSliderα, 0, 0, ofRandom(64, 255)); }
	if (numberSliderβ > 0) { betha = CreatePoints(numberSliderβ, 0, ofRandom(64, 255), 0); }
	if (numberSliderγ > 0) { gamma = CreatePoints(numberSliderγ, ofRandom(64, 255), 0, 0); }
	if (numberSliderδ > 0) { elta = CreatePoints(numberSliderδ, ofRandom(64, 255), ofRandom(64, 255), 0); }
	if (numberSliderε > 0) { epsilon = CreatePoints(numberSliderε, ofRandom(64, 255), 0, ofRandom(64, 255)); }
	if (numberSliderζ > 0) { zeta = CreatePoints(numberSliderζ, 0, ofRandom(64, 255), ofRandom(64, 255)); }
	if (numberSliderη > 0) { eta = CreatePoints(numberSliderη, ofRandom(64, 255), ofRandom(64, 255), ofRandom(64, 255)); }
	if (numberSliderθ > 0) { teta = CreatePoints(numberSliderθ, 0, 0, 0); }
}

I would like to define something like:
void draw() const
{
	ofSetColor(r, g, b, 100); //set particle color + some alpha
	if (numberSliderα > 0)
	{
		ofDrawCircle(x, y, 1.0F); //draw a circle of radius 1.0 at x,y
	}
	if (numberSliderβ > 0)
	{
		ofDrawCircle(x, y, 1.2F); //draw a circle of radius 1.2 at x,y
	}
	if (numberSliderγ > 0)
	{
		ofDrawCircle(x, y, 1.4F); //draw a circle of radius 1.4 at x,y
	}
	if (numberSliderδ > 0)
	{
		ofDrawCircle(x, y, 1.6F); //draw a circle of radius 1.6 at x,y
	}
	if (numberSliderε > 0)
	{
		ofDrawCircle(x, y, 1.8F); //draw a circle of radius 1.8 at x,y
	}
	if (numberSliderζ > 0)
	{
		ofDrawCircle(x, y, 2.0F); //draw a circle of radius 2.0 at x,y
	}
	if (numberSliderη > 0)
	{
		ofDrawCircle(x, y, 2.2F); //draw a circle of radius 2.2 at x,y
	}
	if (numberSliderθ > 0)
	{
		ofDrawCircle(x, y, 2.4F); //draw a circle of radius 2.4 at x,y
	}
}
But Visual Studio gives me errors because these sliders are not defined here (they are defined in the GUI, class ofApp final : public ofBaseApp).
Please help me!

@ker2x
Contributor Author

ker2x commented Dec 27, 2022

I assume you are referring to the old codebase. Can you post a link to your code?

The easiest way to do this would be to add a radius property to the point struct. Then you would just call ofDrawCircle(x, y, radius).
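
A rough sketch of that idea, under some assumptions: the `point` struct below is a hypothetical stand-in for the one in the old ofApp.h, the extra `radius` argument to `CreatePoints` is invented for illustration, and the circle call is passed in as a callable only so the snippet compiles without openFrameworks. In the app, `draw()` would call `ofSetColor` and `ofDrawCircle` directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical version of the old per-particle struct with an added radius.
struct point {
	float x = 0.0F, y = 0.0F;    // position
	float vx = 0.0F, vy = 0.0F;  // velocity
	int r = 0, g = 0, b = 0;     // color
	float radius = 1.5F;         // per-particle draw size (1.5F was the fixed size)

	// In the app the body would be:
	//   ofSetColor(r, g, b, 100);
	//   ofDrawCircle(x, y, radius);
	// The draw call is injected here only to keep the sketch free of openFrameworks.
	template <typename DrawCircle>
	void draw(DrawCircle&& drawCircle) const
	{
		drawCircle(x, y, radius);
	}
};

// Hypothetical CreatePoints variant: the color arguments from the thread plus a radius.
// Positions are left at zero; the app's own CreatePoints is not shown in this thread.
std::vector<point> CreatePoints(int num, int r, int g, int b, float radius)
{
	assert(num >= 0 && radius > 0.0F);
	std::vector<point> points(static_cast<std::size_t>(num));
	for (point& p : points) {
		p.r = r;
		p.g = g;
		p.b = b;
		p.radius = radius;
	}
	return points;
}
```

With something like this, restart() could pass 1.0F for alpha, 1.2F for betha and so on, and draw() would need neither an "if" nor access to the GUI sliders.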

@KhadrasWellun

Here is my last version of the code.
Manuel_src.zip

@KhadrasWellun

> I assume you are referring to the old codebase. Can you post a link to your code?
>
> The easiest way to do this would be to add a radius property to the point struct. Then you would just call ofDrawCircle(x, y, radius).

I tried the new code, but it runs extremely slowly. I couldn't get more than 8 fps with 8 colors of 1000 particles each. Another shortcoming of the new code is that the particles look extremely small, like little dots where you can't really distinguish the color shades. The structures formed don't look good at all because of this.

@KhadrasWellun

KhadrasWellun commented Dec 27, 2022

I also tried to insert a fullscreen button but it didn't work. I used ofToggleFullscreen(), but the screen kept blinking without showing anything.
I'm also trying to figure out how to introduce the 3D vision function (with 3D glasses).
Also, I don't know how to save the color palette generated before saving the model. The particle colors are generated when pressing the buttons that trigger the restart. But when I save the model, the existing colour palette is not saved so that I can load it later. Do you have any idea how I could save this color palette?

@ker2x
Contributor Author

ker2x commented Dec 28, 2022

I'll take a look at your code, and also patch my code to allow drawing circles.
My code shouldn't be slower, this is weird. Are you using OpenMP in your code? I might have forgotten to re-enable it.
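
For reference, a minimal sketch of how OpenMP could be applied to the outer particle loop. This is not the code from the branch. `Group` and `accumulateForces` are hypothetical names, and plain float arrays replace `ofVec2f` so it compiles on its own. It assumes forces are written to separate buffers rather than applied in place, so each iteration writes only its own slot. The pragma has no effect unless OpenMP is enabled at compile time (`-fopenmp` for GCC/Clang, `/openmp` for MSVC). The loop index is a signed `int` because MSVC's OpenMP support, as far as I know, expects a signed index.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-in for one color group (positions and velocities as plain arrays).
struct Group {
	std::vector<float> x, y;
	std::vector<float> vx, vy;
};

// Add the pairwise term of every particle in `a` against every particle in `b`
// to fx/fy. Only fx[i] and fy[i] are written in iteration i, so the outer loop
// can be split across threads without two threads touching the same element.
void accumulateForces(const Group& a, const Group& b, float radius,
		std::vector<float>& fx, std::vector<float>& fy)
{
	assert(fx.size() == a.x.size() && fy.size() == a.x.size());
	const int n = static_cast<int>(a.x.size());
	const int m = static_cast<int>(b.x.size());

	// Ignored by the compiler when OpenMP is not enabled.
	#pragma omp parallel for
	for (int i = 0; i < n; i++)
	{
		float sx = 0.0F;
		float sy = 0.0F;
		for (int j = 0; j < m; j++)
		{
			const float dx = a.x[i] - b.x[j];
			const float dy = a.y[i] - b.y[j];
			const float distance = std::sqrt(dx * dx + dy * dy);
			if (distance < radius) {
				// Same 1/distance weighting as in the interaction code, guarded against zero.
				const float force = 1.0F / std::max(std::numeric_limits<float>::epsilon(), distance);
				sx += dx * force;
				sy += dy * force;
			}
		}
		fx[i] += sx;
		fy[i] += sy;
	}
}
```

The velocity and position update would then run as a second pass over fx/fy. That separation also avoids reading positions that another task is updating at the same moment.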

@ker2x
Contributor Author

ker2x commented Dec 28, 2022

I'll check the fullscreen problem as well.

@KhadrasWellun

KhadrasWellun commented Dec 28, 2022

> I'll take a look at your code, and also patch my code to allow drawing circles. My code shouldn't be slower, this is weird. Are you using OpenMP in your code? I might have forgotten to re-enable it.

I took your exact code and just added more colors and those extra buttons. After compiling, it ran extremely slowly. I had to set a small number of particles (under 1000 of each color) to reach 10 fps.

Here is your code with my additions done the old way; it runs fine with 17313 big particles at 61 fps:
src 1.7.6.5.zip

Here is a proof screenshot:
202301060731

Here is your code with my additions done the new way; it runs slowly, at 4 fps with 9600 particles, and the particles are very small (nearly dots):
src 1.8.5.zip

Here is a proof screenshot:
202301060705


3 participants