Dealing with flicker artifacts? #6

Closed
flowchartsman opened this issue Feb 15, 2023 · 14 comments

Comments

@flowchartsman

Hey, thanks a lot for sharing these libraries! I am attempting to use srnoise8.c to generate a nifty flame effect for a keyboard using an RGB LED matrix (simulation here).

I've cobbled it together from various ideas in other projects, and for the most part it looks pretty great. However, I'm noticing some flickering artifacts caused by (I'm assuming) the potential overflow mentioned here.

Is that a reasonable assumption, or is it likely something else? If it is likely the overflow, is there a decent way to mitigate this somewhat?
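For readers following along, the kind of overflow being asked about can be sketched in isolation. The two helpers below are hypothetical and are not part of srnoise8.c; they only illustrate how a value that no longer fits in a signed 8-bit (1.7s) variable flips sign when it wraps, and how clamping would behave instead. Whether this is what causes the flicker here is exactly the open question.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 8 bits of a wider value and
 * reinterpret them as signed, which is what storing an out-of-range
 * intermediate into a signed char does on common compilers. */
static int8_t narrow_wrap(int16_t v) {
    uint8_t low = (uint8_t)v;  /* modulo-256 reduction */
    return (low < 128) ? (int8_t)low : (int8_t)(low - 256);
}

/* Hypothetical alternative: clamp to the representable range, so an
 * overshoot stays at the extreme instead of flipping sign. */
static int8_t narrow_saturate(int16_t v) {
    if (v > 127) return 127;
    if (v < -128) return -128;
    return (int8_t)v;
}
```

A value of 130 becomes -126 when wrapped but 127 when saturated, so an overshoot of only a few counts turns a bright value into a dark one if it wraps.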

@flowchartsman
Author

flowchartsman commented Feb 15, 2023

I should clarify that I'm actually noticing these artifacts toward the bottom row of the matrix, not the corners. Depending on what "corners" means in the code comment, this might be a red herring, so feel free to close if there's not enough info.

@stegu
Owner

stegu commented Feb 15, 2023 via email

@flowchartsman
Author

I think that might have been a link to the incorrect simulation, I'm so sorry about that. I've updated the example here, and you can see what I mean: https://wokwi.com/projects/356699012994216961. Keep your eye on the lower left third, and it happens pretty quickly after the simulation starts.

@flowchartsman
Author

flowchartsman commented Feb 15, 2023

If it still doesn't appear (this platform seems notorious for not updating correctly), here is the full code:

//simplex noise fire test 
//6x22 simulated matrix for keyboard
#include "FastLED.h"

// Matrix size
#define NUM_ROWS 6
#define NUM_COLS 22
#define NUM_LEDS (NUM_ROWS * NUM_COLS)
 
// LEDs pin
#define DATA_PIN 5
#define LED_TYPE    WS2812B
#define COLOR_ORDER GRB
 
// LED brightness
#define BRIGHTNESS 255
#define MAX_POWER_MILLIAMPS 800 
 
// Define the array of leds
CRGB leds[NUM_LEDS];

signed char fakesin(unsigned char x) {
  signed char s = x & 0x7F;
  s = s - 64;
  s = abs(2 * s);
  s = (s*s)>>7; // Maps to the FMUL instruction in 8-bit AVR MCUs
  s = 127 - s;
  if (x & 0x80) return -s;
  else return s;
}

signed char fakecos(unsigned char x) {
  unsigned char y = (x + 64); // & 0xFF implied by the type
  return fakesin(y);
}

signed char srnoise8(unsigned short x, unsigned short y, unsigned char alpha) {

  // x and y are inherently 8.8u modulo-256, but are treated
  // algorithmically as 7.8u modulo-128 for better efficiency
  x = x & 0x7FFF; // Hopefully the compiler understands that
  y = y & 0x7FFF; // it doesn't need to touch the low byte here.

  // Skew input coords to staggered grid
  unsigned short u = (x) + (y>>1); // u is 8.8u
  unsigned short v = y; // v is just an alias of y, 7.8u

  // Split to integer and fractional parts
  unsigned char u0 = highByte(u); // u0 is 8.0u
  unsigned char v0 = highByte(v); // v0 is 8.0u (or 7.0u, MSB = 0)

  unsigned char uf0 = lowByte(u); // uf0 is 0.8u
  unsigned char vf0 = lowByte(v); // vf0 is 0.8u
    
  // Determine the second vertex for the simplex
  unsigned char u1;
  unsigned char v1;
  if(uf0 > vf0) {
    u1 = u0 + 1; // u1 is 8.0u
    v1 = v0;     // v1 is 8.0u - see below
  } else {
    u1 = u0;
    v1 = v0 + 1; // This requires 8.0u for v1 (but only just)
  }

  // Third vertex is always (+1, +1)
  unsigned char u2 = u0 + 1; // u2 is 8.0u
  unsigned char v2 = v0 + 1; // v2 is 8.0u
	    
  // Transform ui,vi back to x,y coords before wrap
  signed char x0 = (u0<<1) - v0; // x0 is 7.1s
  // x0 varies in steps of 0.5 from -0.5 to 126.5
  unsigned char y0 = v0; // y0 is alias of v0, 8.0u (or 7.0u)
    
  signed short x1 = (u1<<1) - v1; // x1 is "special" 8.1s:
  // x1 varies in steps of 0.5 from -0.5 to 127.5,
  // hence 8.1s is required, but only just. With a further reduced
  // x,y domain of 6.8u, x1 would be 7.1s and fit in a signed char.
  unsigned char y1 = v1; // y1 is alias of v1, 8.0u

  unsigned char x2 = (u2<<1) - v2; // x2 is 7.1u, will never be negative
  unsigned char y2 = v2; // y2 is alias of v2, 8.0u

  // Compute vectors in x,y coords from vertices
  signed char xf0 = lowByte(x>>1) - (x0<<6); // x is 8.8u, x0 is 7.1u, xf0 is 1.7s
  signed char yf0 = lowByte(y>>1) - (y0<<7); // yf0 is 1.7s
  signed char xf1 = lowByte(x>>1) - (x1<<6); // xf1 is 1.7s
  signed char yf1 = lowByte(y>>1) - (y1<<7); // yf1 is 1.7s
  signed char xf2 = lowByte(x>>1) - (x2<<6); // xf2 is 1.7s, always xf0 - 0.5, room for optimization
  signed char yf2 = lowByte(y>>1) - (y2<<7); // yf2 is 1.7s, always yf0 - 1
    
  // Generate vertex hashes from ui, vi
  unsigned char hash0; // hash0 is 8.0u
  hash0 = (13*u0 + 7)*u0;
  hash0 = hash0 + v0;
  hash0 = (15*hash0 + 11)*hash0;

  // TODO: When u1=u0, we could reuse the u hash from hash0
  unsigned char hash1;
  hash1 = (13*u1 + 7)*u1;
  hash1 = hash1 + v1;
  hash1 = (15*hash1 + 11)*hash1;

  // TODO: When u1=u0+1, we could reuse the u hash from hash1
  unsigned char hash2; // hash2 is 8.0u
  hash2 = (13*u2 + 7)*u2;
  hash2 = hash2 + v2;
  hash2 = (15*hash2 + 11)*hash2;
	
  // Pick gradients from a small set of 8 directions    
  signed char g0x; // All these are 1.7s
  signed char g0y;
  signed char g1x;
  signed char g1y;
  signed char g2x;
  signed char g2y;

  // using +/-0.9921875 (+/-127) instead of
  // +0.9921875/-1.0 (+127/-128) for symmetry
  if (hash0 & 0x01) {
    g0x = 54; g0y = 108;
  } else {
    g0x = 108; g0y = 54;
  };
  if(hash0 & 0x02) {
    g0x = -g0x;
  };
  if (hash0 & 0x04) {
    g0y = -g0y;
  };

  if (hash1 & 0x01) {
    g1x = 54; g1y = 108;
  } else {
    g1x = 108; g1y = 54;
  };
  if(hash1 & 0x02) {
    g1x = -g1x;
  };
  if (hash1 & 0x04) {
    g1y = -g1y;
  };

  if (hash2 & 0x01) {
    g2x = 54; g2y = 108;
  } else {
    g2x = 108; g2y = 54;
  };
  if(hash2 & 0x02) {
    g2x = -g2x;
  };
  if (hash2 & 0x04) {
    g2y = -g2y;
  };

  signed char Ca, Sa;  // 1.7s
  signed char g0x_t, g0y_t, g1x_t, g1y_t, g2x_t, g2y_t; 
  if(alpha != 0) { // Rotate the gradients
    Sa = fakesin(alpha);
    Ca = fakecos(alpha);
    g0x_t = g0x; // all 1.7s, temp storage to prevent trampling on input values
    g0y_t = g0y;
    g1x_t = g1x;
    g1y_t = g1y;
    g2x_t = g2x;
    g2y_t = g2y;
    g0x = ((Ca*g0x_t)>>7) - ((Sa*g0y_t)>>7); // These twelve multiplications constitute a
    g0y = ((Sa*g0x_t)>>7) + ((Ca*g0y_t)>>7); // considerable amount of work for a weak CPU.
    g1x = ((Ca*g1x_t)>>7) - ((Sa*g1y_t)>>7); // Don't use alpha != 0 unless you need it.
    g1y = ((Sa*g1x_t)>>7) + ((Ca*g1y_t)>>7);
    g2x = ((Ca*g2x_t)>>7) - ((Sa*g2y_t)>>7);
    g2y = ((Sa*g2x_t)>>7) + ((Ca*g2y_t)>>7);
  }

  // Compute ramps (g dot u) from vertices
  signed char g0 = ((g0x*xf0)>>7) + ((g0y*yf0)>>7); // g0 is 1.7s
  signed char g1 = ((g1x*xf1)>>7) + ((g1y*yf1)>>7); // g1 is 1.7s
  signed char g2 = ((g2x*xf2)>>7) + ((g2y*yf2)>>7); // g2 is 1.7s
  // Note that g0/g1/g2 will overflow 1.7s at some corners, but
  // the overflow happens only in regions where m0/m1/m2 = 0 and
  // the incorrect, sign-flipped value is multiplied by zero.

  // Compute radial falloff from vertices
  unsigned char r0 = ((xf0*xf0)>>7) + ((yf0*yf0)>>7); // r0 is 1.7u
  unsigned char m0;
  if(r0 > 102) {
    m0 = 0;
  } else {
    m0 = 255 - (r0<<1) - (r0>>1);
    // m0 is 0.8u, "(r0<<1)+(r0>>1)" is 1.25*r0 in 0.8u
    m0 = (m0*m0)>>8; // 8-bit by 8-bit to 8-bit 0.8u mult
    m0 = (m0*m0)>>8;
  }

  unsigned char r1 = ((xf1*xf1)>>7) + ((yf1*yf1)>>7); // r1 is 1.7u
  unsigned char m1;
  if(r1 > 102) {
    m1 = 0;
  } else {
    m1 = 255 - (r1<<1) - (r1>>1); // m1 is 0.8u
    m1 = (m1*m1)>>8;
    m1 = (m1*m1)>>8;
  }

  unsigned char r2 = ((xf2*xf2)>>7) + ((yf2*yf2)>>7); // r2 is 0.8u
  unsigned char m2;
  if(r2 > 102) {
    m2 = 0;
  } else {
    m2 = 255 - (r2<<1) - (r2>>1); // m2 is 0.8u
    m2 = (m2*m2)>>8;
    m2 = (m2*m2)>>8;
  }

  // Multiply ramps with falloffs
  signed char n0 = (g0 * m0)>>6; // g0 is 1.7s, m0 is 0.8u, n0 is 1.7s, scale by 4
  signed char n1 = (g1 * m1)>>6; // mult scaled by 4
  signed char n2 = (g2 * m2)>>6; // mult scaled by 4
  // Factors gi, mi span their ranges and can't be scaled individually,
  // but all of the products (ni) are always < 0.25 and can be shifted
  // left 2 steps for two additional bits of precision.
  // Multiplications in ATmega32 are 8-by-8-to-16 bits, but selecting
  // the "best" bits requires a few extra operations.

  // Sum up noise contributions from all vertices
  signed char n;
  n = (145*(n0 + n1 + n2))>>7; // Scale to better cover the range [-128,127]

  // We're done. Return the result in 1.7s format.
  return n;
}
 
void setup() {
  FastLED.addLeds<LED_TYPE, DATA_PIN, COLOR_ORDER>(leds, NUM_LEDS).setCorrection( TypicalLEDStrip );
  //FastLED.setMaxPowerInVoltsAndMilliamps(5, MAX_POWER_MILLIAMPS);
  FastLED.setBrightness(BRIGHTNESS);
}

uint16_t time = 0; 
void loop() {
  // 30 is good for 6x22
  #define scalenoise 30
  //int time = millis();
  time++;
  for (byte i = 0; i < NUM_COLS; i++) {
    for (byte j = 0; j < NUM_ROWS; j++) {
      leds[XY(i, j)] = ColorFromPalette(
        HeatColors_p,
        qsub8(srnoise8(i * scalenoise, j * scalenoise + time, time / 3) + 128,
              abs8(j - (NUM_ROWS - 1)) * 255 / (NUM_ROWS + 4)),
        BRIGHTNESS);
    }
  }
  FastLED.show();
}

uint16_t XY (uint8_t x, uint8_t y) { return (y * NUM_COLS + x);}

@flowchartsman
Author

flowchartsman commented Feb 15, 2023

In particular, the difference is that I am actually using a uint16_t for the time code and simply incrementing it once per loop. Sorry in advance for the runaround if you have to paste the code. I hate to waste your time, but even after saving and re-generating the share link, it was still using the millis() call and not the uint16_t value, which also slows it down somewhat.
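As a side note on that counter, here is a minimal sketch of how the incrementing uint16_t ends up in the noise arguments, assuming the call shape from the pasted loop (`j * scalenoise + time` for y, `time / 3` for alpha, with scalenoise = 30). The helper names are hypothetical. The y argument wraps cleanly modulo 65536, but alpha is the low byte of `time / 3`, so it jumps from 85 back to 0 when the counter itself wraps. That happens only once every 65536 frames, so it is unlikely to explain flicker that shows up right after start.

```c
#include <stdint.h>

/* Hypothetical helper: alpha as the sketch derives it. The noise function
 * takes an unsigned char, so only the low byte of t / 3 survives. */
static uint8_t alpha_from_time(uint16_t t) {
    return (uint8_t)(t / 3);
}

/* Hypothetical helper: the y coordinate for a given row, truncated to
 * 16 bits as it would be when passed as an unsigned short. */
static uint16_t y_from_time(uint8_t row, uint16_t t) {
    return (uint16_t)(row * 30u + t);
}
```

For example, `alpha_from_time(767)` is 255 and `alpha_from_time(768)` is 0, which is a smooth wrap of the rotation angle, whereas going from `t = 65535` (alpha 85) to `t = 0` (alpha 0) is not.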

@flowchartsman
Author

flowchartsman commented Feb 15, 2023

And, just for completeness’ sake, here are a couple of examples in situ. The first occurs in the lower-right command key area:

FullSizeRender.MOV

And the other near “K” and “L”:

FullSizeRender.MOV

In both of these cases, it's an incrementing uint16_t that's getting fed to the noise function.

@stegu
Owner

stegu commented Feb 15, 2023 via email

@flowchartsman
Author

flowchartsman commented Feb 15, 2023

This one?
https://github.com/stegu/perlin-noise/blob/master/src/srnoise8.c#L261

I definitely wouldn't change that without knowing what I was doing. I tried not to modify the code at all.

@flowchartsman
Author

flowchartsman commented Feb 15, 2023

I can confirm that this change seems to eliminate the issue, however! At least, mostly. I still see a little flicker on the very bottom row in the "hottest" areas, at times, but it's only really noticeable at smaller time deltas, so that seems like a win to me.
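One possible reason the remaining flicker is most visible on one row can be sketched from the palette-index arithmetic in the pasted loop. The code below is a plain-C stand-in, not FastLED itself: `sat_sub8` is a hypothetical substitute for `qsub8`, and `row_cooling` reproduces the `abs8(j - (NUM_ROWS-1)) * 255 / (NUM_ROWS+4)` term for a 6-row matrix. For row index 5 the cooling term is 0, so that row shows the full noise range and any sign flip in the noise swings the index between the extremes; for row index 0 about half the range is subtracted. Which physical row has index 5 depends on the wiring, so this is an assumption rather than a diagnosis.

```c
#include <stdint.h>

enum { ROWS = 6 };  /* matches NUM_ROWS in the pasted sketch */

/* Stand-in for a saturating 8-bit subtract: floors at 0. */
static uint8_t sat_sub8(uint8_t a, uint8_t b) {
    return (a > b) ? (uint8_t)(a - b) : 0;
}

/* Amount subtracted from the noise for a given row index. */
static uint8_t row_cooling(uint8_t row) {
    int d = (int)row - (ROWS - 1);
    if (d < 0) d = -d;
    return (uint8_t)(d * 255 / (ROWS + 4));
}

/* Palette index: bias the signed noise into 0..255, then cool by row. */
static uint8_t heat_index(int8_t noise, uint8_t row) {
    uint8_t biased = (uint8_t)(noise + 128);
    return sat_sub8(biased, row_cooling(row));
}
```

With these definitions, a noise value flipping from 127 to -128 on row index 5 moves the palette index from 255 to 0, while on row index 0 the same flip moves it only from 128 to 0.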

@stegu
Owner

stegu commented Feb 15, 2023 via email

@stegu
Owner

stegu commented Feb 15, 2023 via email

@stegu stegu closed this as completed Feb 15, 2023
@stegu
Owner

stegu commented Feb 15, 2023

Resolved. Thanks for the bug report!

@flowchartsman
Author

flowchartsman commented Feb 15, 2023

My pleasure. While I have you: armnoise8 and srnoise8 seem very similar, right down to the header file for srnoise8 saying armnoise: https://github.com/stegu/perlin-noise/blob/master/src/srnoise8.h#L1

Aside from the extra dimension, are there any other notable differences in low-power use?

Edit: I went ahead and filed a chore issue for the header comment, so you can fix that if you want, and if you are feeling magnanimous, maybe we could get a tiny guidance comment on when to use armnoise8 vs. srnoise8.

Thanks again!

@stegu
Owner

stegu commented Feb 16, 2023 via email
