performance bottlenecks in iOS and android #110

Open
huachangmiao opened this issue Apr 9, 2018 · 71 comments

Comments

@huachangmiao
Contributor

Is there a better way to optimize?

@durswd
Collaborator

durswd commented Apr 9, 2018

-Program
I have four ideas.

SIMD. Some parts can be improved with NEON.
Multicore. Parallel computing with multiple threads. It requires many changes.
Reduce calculations. If parameters are at their defaults, skip the calculations.
GPU. Move some calculations from the CPU to the GPU. It requires many changes.
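
As a rough illustration of the third idea (skipping work when parameters are at their defaults), here is a minimal C++ sketch. The types `ParticleParams` and `Vec2` and the function `applyParams` are hypothetical names made up for this example; they are not part of Effekseer.

```cpp
#include <cassert>

// Hypothetical per-particle parameters; the names are illustrative only.
struct ParticleParams {
    float scale = 1.0f;    // default: no scaling
    float offsetX = 0.0f;  // default: no translation
    float offsetY = 0.0f;
};

struct Vec2 {
    float x;
    float y;
};

// True when every parameter still has its default value.
inline bool isDefault(const ParticleParams& p) {
    return p.scale == 1.0f && p.offsetX == 0.0f && p.offsetY == 0.0f;
}

// Transforms `count` vertices in place and returns how many were computed.
inline int applyParams(const ParticleParams& p, Vec2* vertices, int count) {
    assert(vertices != nullptr || count == 0);
    if (isDefault(p)) {
        return 0;  // fast path: the transform is the identity, so skip the loop
    }
    for (int i = 0; i < count; ++i) {
        vertices[i].x = vertices[i].x * p.scale + p.offsetX;
        vertices[i].y = vertices[i].y * p.scale + p.offsetY;
    }
    return count;
}
```

Whether such a check pays off depends on how often effects really leave parameters at their defaults, so it would need to be measured on a device.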

@durswd
Collaborator

durswd commented Apr 12, 2018

I saw cocos2d-x form. I think it is strange.

@huachangmiao
Contributor Author

sorry. what does form mean?

@durswd
Collaborator

durswd commented Apr 12, 2018

Cocos2d-x Forums. Sorry for the typo.

@huachangmiao
Contributor Author

I think so too.
The performance increases a lot when i mark the drawElement function.

@huachangmiao
Contributor Author

Maybe something is wrong in the rendering.

@durswd
Collaborator

durswd commented Apr 12, 2018

I think the effects call drawElement 4 or 5 times per frame.
Otherwise, it is a bug in Effekseer for cocos2d-x or in Effekseer 1.4 (this is a beta version).
I'll check it in a few days.

@durswd
Collaborator

durswd commented Apr 15, 2018

First, I am checking how many times drawElement is called per frame.
drawElement is called 3 or 5 times, and there are not many vertices.
Is it the same in your environment?

I think the bottleneck is the time spent sending vertex data between the GPU and the CPU.

Next I will check it on a smartphone and look for bottlenecks.

Homing_Laser01: 3 drawElement calls
image

Simple_Distortion: 5 drawElement calls
image

Sword_Ember is only for PC.

@durswd
Collaborator

durswd commented Apr 15, 2018

And by "i mark", do you mean commenting it out?

@huachangmiao
Contributor Author

yes.

@durswd
Collaborator

durswd commented Apr 16, 2018

I'm very sorry, something is wrong with my iPhone's connector.
I changed the Lightning cable, but the phone is still not recognized.
So I will buy a new smartphone.
Please wait a few days.

@durswd
Collaborator

durswd commented Apr 17, 2018

I bought an iPhone SE, which has the same specs as the iPhone 6s.
I played Homing and Distortion, but it maintains 60 fps.
Would you please send me your project?

@huachangmiao
Contributor Author

If you add a few more items, how many can it play while maintaining 60 fps?

@durswd
Collaborator

durswd commented Apr 17, 2018

I will try it.

@durswd
Collaborator

durswd commented Apr 17, 2018

I'm going to sleep. In Japan, it is 1 o'clock. I will report the result tomorrow.

@huachangmiao
Contributor Author

have a good dream

@durswd
Collaborator

durswd commented Apr 18, 2018

Three homings drop the frame rate below 60 fps (40~60 fps).
But I got a good hint for improving it.
I will try to optimize it this weekend.

@huachangmiao
Contributor Author

Looking forward to your good news.

@durswd
Collaborator

durswd commented Apr 22, 2018

Good news.
I managed to improve performance on iOS.
It is at least 2 times faster.

This version has not been tested on Android yet, so it is on the optimized branch.

https://github.com/effekseer/EffekseerForCocos2d-x/tree/optimized

You also need to change your code as follows.

manager = efk::EffectManager::create(rsize, 8000);

8000 is the maximum number of sprites generated in an application.

@huachangmiao
Contributor Author

How is the performance of the Unity version on iOS?
There are more than 100 roles (characters) in my game's battle scene.
Playing 10 homings while maintaining 60 fps would be OK for me, but right now it is still 3.
T _ T

@durswd
Collaborator

durswd commented Apr 23, 2018

This is my personal opinion.

Effekseer is being optimized and will be updated in a few days.
But even if Effekseer for Unity is updated, I think 4 or 5 homings is the maximum that can maintain 60 fps.

This is not about the effects themselves.
It is harder to maintain 60 fps in Unity than in cocos2d-x because of C# and GC (C++ is very fast).

I think that showing 100 roles and rich effects at the same time is very difficult on current smartphones.
You need to reduce the effect sprites and find clever ways to handle the roles.

@huachangmiao
Contributor Author

huachangmiao commented Apr 23, 2018

Still, thank you for everything.
This is my game. It is already released.
http://yxwd.qq.com
This time, we want to make a 3D game.
Let's optimize together.

@durswd
Collaborator

durswd commented Apr 23, 2018

Looks very good.
I will help you.

To optimize it, I have some ideas.

  1. Homing is not a light effect.
    Homing is a sample for PC, so it is not optimized.
    It is heavier than it looks.

We are implementing Effekseer 1.4, which is a multi-platform version and will show draw calls in the editor. That will make it easier to optimize.

  2. Culling
    Effekseer has a culling system to hide effects that are out of view.
    But it is not used in the current Effekseer for cocos2d-x.
    If this function is enabled, only the effects in view are shown.

  3. Multithreading
    I need some time to implement it.

  4. OpenGL ES 3.0
    If we can use OpenGL ES 3.0, it may improve performance.
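
To make the culling idea more concrete, here is a minimal C++ sketch of a conservative visibility test against a 2D view rectangle. It does not use Effekseer's actual culling system; `ViewRect`, `EffectBounds`, `isVisible` and `cullEffects` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration; this is not Effekseer's culling API.
struct ViewRect {
    float left, right, bottom, top;
};

struct EffectBounds {
    float x, y, radius;  // bounding circle of one effect
};

// Conservative test: true when the circle's bounding box overlaps the view.
inline bool isVisible(const EffectBounds& e, const ViewRect& v) {
    assert(e.radius >= 0.0f);
    return e.x + e.radius >= v.left && e.x - e.radius <= v.right &&
           e.y + e.radius >= v.bottom && e.y - e.radius <= v.top;
}

// Returns the indices of the effects that should be updated and drawn.
inline std::vector<int> cullEffects(const std::vector<EffectBounds>& effects,
                                    const ViewRect& view) {
    std::vector<int> visible;
    for (int i = 0; i < static_cast<int>(effects.size()); ++i) {
        if (isVisible(effects[i], view)) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

The test may keep an effect that only touches a corner region outside the view, which is harmless: it costs a little extra drawing but never hides something that should be shown.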

I have questions.

  1. Is your 3D game TPS, FPS or isometric?

  2. Are many large effects like Homing shown at the same time?
    Or 2 or 3 large effects and many hit effects?

If you cannot share this information in public, please send me an email.

@huachangmiao
Contributor Author

huachangmiao commented Apr 23, 2018

Just like this game: http://yxwd.qq.com
But with a 3D scene, 3D roles, 3D effects and more roles.

@huachangmiao
Contributor Author

huachangmiao commented Apr 23, 2018

When will version 1.4 be released?
There are still many Android users on OpenGL ES 2.0, so I have to use OpenGL ES 2.0.

@durswd
Collaborator

durswd commented Apr 23, 2018

Thank you. I'll play your game and check it.

I plan to release 1.4 beta in early May on Github.

@huachangmiao
Contributor Author

thx. I will follow it.

@huachangmiao
Contributor Author

It can play 50 homings while maintaining 60 fps.
You just need to build the iOS project in release mode.
orz.
That's great.

@huachangmiao
Contributor Author

huachangmiao commented Apr 25, 2018

When the number of vertices exceeds a certain value, the display is wrong.
Maybe when vertices > 65535.
You can try playing 50 Laser03.efk.
You said that 8000 is the maximum number of sprites generated in an application.
May I set this param to more than 8000?

@durswd
Collaborator

durswd commented Apr 25, 2018

  1. performance
    I will try it. (I checked in release mode, but only up to 6 effects, because the display was filled.)

  2. vertices > 65535
    This happens because the vertex id is managed as a short.
    I need to fix the sprite rendering.

You can set this param to more than 8000.
But if this parameter is larger than 65535 / 6 (6 is the number of indices per sprite), rendering may be wrong or invalid.
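
The bound mentioned above can be written down as a small C++ sketch. It only restates the arithmetic from this comment (65535 / 6); the constant and function names are hypothetical and do not come from Effekseer.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Indices per sprite: two triangles with three indices each.
constexpr int kIndicesPerSprite = 6;

// Upper bound quoted in the thread: 65535 / 6 sprites (integer division).
constexpr int kMaxSafeSprites =
    std::numeric_limits<std::uint16_t>::max() / kIndicesPerSprite;

// Clamps a requested sprite count to the bound above.
inline int clampSpriteCount(int requested) {
    assert(requested >= 0);
    return requested > kMaxSafeSprites ? kMaxSafeSprites : requested;
}
```

With this bound, the 8000 used earlier in the thread is still within range, while a much larger request would be clamped before being passed on as the sprite count.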

Thanks to your information, I can fix these bugs.

@huachangmiao
Contributor Author

I have almost completed the battle part of my project.
I tested the performance on some devices.

iPhone6/iPhone6s/iPhone7/iPhone8/iPhoneX : Perfect performance.
huawei CUN-AL00(GPU none): acceptable
huawei Honor V10 (荣耀V10, GPU mali): very bad

Very strange.
The Honor V10's hardware is better than the CUN-AL00's, but its performance is worse.

Is this article helpful?
https://community.arm.com/graphics/f/discussions/6657/how-to-gain-performance-through-pbo-pixel-buffer-object-on-mali-t-880

@durswd
Collaborator

durswd commented May 29, 2018

I find it very strange too.
I read it.

I have questions.

  1. What is the FPS on the CUN-AL00 and the Honor V10 (荣耀V10)?
  2. The Honor V10's display resolution is 4 times larger than the CUN-AL00's. Do you reduce the frame buffer's resolution?
  3. Do you use the maximum vertex count?

I am working on parallel computing now. I hope this optimization helps you.

@huachangmiao
Contributor Author

  1. I locked the frame rate to 30 fps.
    iOS: 30 fps.
    CUN-AL00: 20-25 fps.
    Honor V10 (荣耀V10): 12 fps.

  2. I draw all objects into a 1136 * 640 renderTexture, then draw the renderTexture scaled to the device's display resolution.

  3. I set the maximum vertex count to 8000.

By the way, I only play 2-3 effects at the same time.

@durswd
Collaborator

durswd commented May 29, 2018

Can you send me this effect?

@huachangmiao
Contributor Author

skills.zip

@durswd
Collaborator

durswd commented May 29, 2018

Thank you. I will check it. I'm going to sleep.

@huachangmiao
Contributor Author

Thank you. ; )

@durswd
Collaborator

durswd commented May 29, 2018

I will try to use this tool.
https://developer.arm.com/products/software-development-tools/graphics-development-tools/mali-graphics-debugger

@huachangmiao
Contributor Author

I disabled mapbuffer and bufferrange.
It's still very slow on the huawei honor v10.

@durswd
Collaborator

durswd commented May 30, 2018

Does this change give enough performance?
I think mapbuffer and bufferrange are not good on current Android (errors and performance).

@huachangmiao
Contributor Author

huachangmiao commented May 30, 2018

No improvement at all. Still 10-12 fps.
I will try disabling NEON tomorrow.

@durswd
Collaborator

durswd commented May 30, 2018

OK.

@huachangmiao
Contributor Author

huachangmiao commented May 31, 2018

Finally, I found the reason.

void Renderer::setupVBO()
{
    glGenBuffers(2, &_buffersVBO[0]);
    // Issue #15652
    // Should not initialize VBO with a large size (VBO_SIZE=65536),
    // it may cause low FPS on some Android devices like LG G4 & Nexus 5X.
    // It's probably because some implementations of OpenGLES driver will
    // copy the whole memory of VBO which initialized at the first time
    // once glBufferData/glBufferSubData is invoked.
    // For more discussion, please refer to cocos2d/cocos2d-x#15652
    // mapBuffers();
}
I set the maximum vertex count to 100.

It runs OK now, at 30 fps.
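
To see why the initial size matters so much, here is a small C++ model of the driver behaviour described in the quoted cocos2d-x comment. It is only a simulation under that assumption (every update copies the whole buffer as first allocated) and makes no real OpenGL ES calls. `ModelVbo` and `copiedAfter` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Simplified model: each update copies the whole allocation,
// not just the bytes that were actually written.
struct ModelVbo {
    std::size_t allocatedBytes = 0;
    std::size_t copiedBytes = 0;  // total bytes the modelled driver has copied

    void allocate(std::size_t bytes) { allocatedBytes = bytes; }

    void update(std::size_t usedBytes) {
        assert(usedBytes <= allocatedBytes);
        copiedBytes += allocatedBytes;  // independent of usedBytes in this model
    }
};

// Bytes copied after `frames` updates of `usedBytes` each.
inline std::size_t copiedAfter(std::size_t allocatedBytes,
                               std::size_t usedBytes, int frames) {
    ModelVbo vbo;
    vbo.allocate(allocatedBytes);
    for (int i = 0; i < frames; ++i) {
        vbo.update(usedBytes);
    }
    return vbo.copiedBytes;
}
```

In this model the copy cost scales with the allocated size rather than with the data actually used, which would be consistent with the improvement seen after lowering the maximum count from 8000 to 100. Real drivers may behave differently.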

How can we fix it?

@durswd
Collaborator

durswd commented May 31, 2018

Thank you !!!!!!!!!
I will think about how to fix it (perhaps by dividing it into two draw calls).

By the way, may I remove the division count parameter in ring and track?
I think it is no longer needed.

-So I have a new idea.
-Can the Track and the Ribbon set the count of vertices? just like the Ring.

@huachangmiao
Contributor Author

I think it can be removed. : )

@durswd
Collaborator

durswd commented May 31, 2018

OK!! Please wait a few days.
Until I fix it, please set the maximum vertex count to 100.

@durswd
Collaborator

durswd commented Jun 1, 2018

https://github.com/effekseer/EffekseerForCocos2d-x

I updated it. Is it OK?

// a large buffer makes the application slow on Android
int32_t spriteSize = 600;

#if (CC_TARGET_PLATFORM == CC_PLATFORM_IOS || CC_TARGET_PLATFORM == CC_PLATFORM_ANDROID)
renderer2d = ::EffekseerRendererGL::Renderer::Create(spriteSize, EffekseerRendererGL::OpenGLDeviceType::OpenGLES2);
#else
renderer2d = ::EffekseerRendererGL::Renderer::Create(spriteSize, EffekseerRendererGL::OpenGLDeviceType::OpenGL2);
#endif

Here spriteSize is larger than 100, but the VBO size stays below 65536.
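
The relationship between spriteSize and the VBO size can be checked with a short C++ sketch. The thread does not say whether 65536 refers to vertices or to bytes, so the sketch computes both. The 24-byte vertex size is an assumption made for this example (12 bytes position, 4 bytes color, 8 bytes UV) and is not confirmed here; all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kVerticesPerSprite = 4;    // one quad per sprite
constexpr std::size_t kAssumedVertexBytes = 24;  // assumption, see the lead-in

// Number of vertices needed for `spriteSize` sprites.
inline std::size_t vertexCount(std::size_t spriteSize) {
    return spriteSize * kVerticesPerSprite;
}

// Buffer size in bytes under the assumed vertex size.
inline std::size_t vertexBytes(std::size_t spriteSize) {
    return vertexCount(spriteSize) * kAssumedVertexBytes;
}

// Largest sprite count whose buffer is strictly smaller than `byteLimit`.
inline std::size_t maxSpritesBelowBytes(std::size_t byteLimit) {
    assert(byteLimit > 0);
    return (byteLimit - 1) / (kVerticesPerSprite * kAssumedVertexBytes);
}
```

Under these assumptions, 600 sprites need 2400 vertices or 57600 bytes, so the value stays below 65536 in either reading.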

@huachangmiao
Contributor Author

huachangmiao commented Jun 4, 2018

I'm sorry for my late reply. I spent some time trying it.

  • With spriteSize changed to 600, the display is wrong when playing a lot of effects. The performance is also still affected on Android devices like the huawei honor v10.

I referred to the cocos rendering code.
Some implementations of the OpenGL ES driver copy the whole memory of the VBO as it was first initialized.
I think glBufferData should be called before every draw.
I commented out the glBufferData call in VertexBuffer's creation function and added the call to VertexBuffer::Unlock().
I built it for the huawei honor v10. It works well.
I think it should be modified like this, at least on the Android platform.
What do you think?
default

@durswd
Collaborator

durswd commented Jun 4, 2018

I think it is OK.
But it should be wrapped in #ifdef __ANDROID__.
It seems that this implementation decreases performance on iOS and PC.
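
The change discussed here could look roughly like the following C++ sketch. It records call names instead of calling OpenGL ES, so it runs without a GL context. `SketchVertexBuffer` and `CallLog` are hypothetical stand-ins and not Effekseer's VertexBuffer class. Passing only the used size to glBufferData on the Android path, and using glBufferSubData on the other path, are assumptions that the thread does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Records the calls a real implementation would forward to OpenGL ES.
struct CallLog {
    std::vector<std::string> calls;
};

// Hypothetical vertex buffer wrapper used only for this illustration.
class SketchVertexBuffer {
public:
    SketchVertexBuffer(std::size_t capacityBytes, bool respecifyOnUnlock, CallLog& log)
        : capacity_(capacityBytes), respecify_(respecifyOnUnlock), log_(log) {
        if (!respecify_) {
            // iOS/PC path: allocate the storage once, up front.
            log_.calls.push_back("glBufferData(" + std::to_string(capacity_) + ")");
        }
    }

    // Called after the CPU-side data for this frame has been written.
    void unlock(std::size_t usedBytes) {
        assert(usedBytes <= capacity_);
        if (respecify_) {
            // Android path from the thread: specify the store before every draw.
            log_.calls.push_back("glBufferData(" + std::to_string(usedBytes) + ")");
        } else {
            log_.calls.push_back("glBufferSubData(" + std::to_string(usedBytes) + ")");
        }
    }

private:
    std::size_t capacity_;
    bool respecify_;
    CallLog& log_;
};

// Compile-time default following the #ifdef __ANDROID__ suggestion.
#ifdef __ANDROID__
constexpr bool kRespecifyByDefault = true;
#else
constexpr bool kRespecifyByDefault = false;
#endif
```

Keeping the switch in one constant makes it easy to compare both paths on the same device before deciding which platforms should use which one.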

@durswd
Collaborator

durswd commented Jun 4, 2018

Another question:
Is nBufferingMode true? If this option makes no sense, I want to remove it.

@huachangmiao
Contributor Author

OK. This is just temporary code. I will add #ifdef __ANDROID__ later.

Is nBufferingMode true on Android with OpenGL ES 2.0?
image

nBufferingMode = !GLExt::IsSupportedBufferRange() && GLExt::IsSupportedMapBuffer();

@durswd
Collaborator

durswd commented Jun 4, 2018

I think it is true.
If you have time, would you please check the performance with nBufferingMode = false inserted?

@huachangmiao
Contributor Author

OK, I will check it.

@huachangmiao
Contributor Author

huachangmiao commented Jun 4, 2018

I tested on
huawei CUN-AL00:
nBufferingMode = false; 38 fps
nBufferingMode = true; 41 fps

ZTE C880U:
nBufferingMode = false; 41 fps
nBufferingMode = true; 42 fps

I think you can remove triple buffering. It is meant for multi-threaded rendering, but cocos renders on a single thread.
mapBuffer shouldn't be removed, because it can improve performance on Android.

I optimized the code.

image

@durswd
Collaborator

durswd commented Jun 4, 2018

I fixed it.
(Removed nBuffering and added __ANDROID__)

https://github.com/effekseer/EffekseerForCocos2d-x

Is it OK?

@huachangmiao
Contributor Author

huachangmiao commented Jun 5, 2018

Thank you.
This is a little different.

image

image

@durswd
Collaborator

durswd commented Jun 5, 2018

Thank you, I will try to fix it.

@durswd
Collaborator

durswd commented Jun 5, 2018

@huachangmiao
Contributor Author

It's OK

@durswd
Collaborator

durswd commented Jun 6, 2018

Thank you for your help.
