New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infinite collision #43

Closed
gioblu opened this Issue Apr 1, 2016 · 7 comments

Comments

Projects
None yet
3 participants
@gioblu
Owner

gioblu commented Apr 1, 2016

As came out from tests made by Mauro and from suggestions made by the community on the hackaday post, there is a bug. If we code in the same way many devices and they start at the same time, for example, because of a common power supply, they will likely all collide and in the same way execute quadratic backoff always colliding or finding the channel BUSY.

So I see 2 possible cases (both leads to collision loop):

  • Startup collision
  • Casual collision

On transmitter side:

  • Add initial random delay on device init (to avoid startup collision)
  • Add random delay if NAK (to avoid casual collision)
  • Add a random delay if FAIL (to avoid casual collision)

Please feel free to share your point / suggest

@gioblu gioblu added bug TODO labels Apr 1, 2016

@morpheus000

This comment has been minimized.

morpheus000 commented Apr 1, 2016

We could try something like nick gammon in his rs485 example did: http://www.gammon.com.au/forum/?id=11428

@thomsna

This comment has been minimized.

thomsna commented Apr 4, 2016

@gioblu Would this implement what you suggested?

int response = bus.send(address, message, 2);

while (1) { 
    if (response == NAK || response == FAIL) {
        delay(random(0, 20));
    }
    bus.update();
}
@gioblu

This comment has been minimized.

Owner

gioblu commented Apr 5, 2016

Ciao @thomsna. Yes, here it is what I am locally testing:

uint16_t PJON::send_string(uint8_t id, char *string, uint8_t length) {
  if(!*string) return FAIL;
  if(!this->can_start()) return BUSY;

  uint8_t CRC = 0;
  pinModeFast(_input_pin, OUTPUT);

  this->send_byte(id);
  CRC ^= id;
  this->send_byte(length + 3);
  CRC ^= length + 3;

  for(uint8_t i = 0; i < length; i++) {
    this->send_byte(string[i]);
    CRC ^= string[i];
  }

  this->send_byte(CRC);
  digitalWriteFast(_input_pin, LOW);

  if(id == BROADCAST) return ACK;

  uint32_t time = micros();
  uint16_t response = FAIL;

  /* Receive byte for an initial BIT_SPACER bit + standard bit total duration.
     (freak condition used to avoid micros() overflow bug) */
  while(response == FAIL && !((uint32_t)(micros() - time) >= BIT_SPACER + BIT_WIDTH))
    response = this->receive_byte();

  if(response == ACK) return response;

  /* Random delay if NAK, corrupted ACK/NAK or collision */
  if(response != FAIL)
    delayMicroseconds(random(0, COLLISION_MAX_DELAY));

  if(response == NAK) return response;

  return FAIL;
};

With this approach, modifying a little the send_string function you are now catching NAKand also other response values like a damaged ACK/NAK or simply a collision with another transmission.

I think FAIL should not be considered as a sure collision, but more likely absence of communication, so I think should be treated differently.

@thomsna

This comment has been minimized.

thomsna commented Apr 5, 2016

Nice to see an improvement is in the pipeline for this matter :)
I guess the key here is this line: delayMicroseconds(random(0, COLLISION_MAX_DELAY));
where (0, COLLISION_MAX_DELAY) is the window defined by quadratic backoff.

@gioblu

This comment has been minimized.

Owner

gioblu commented Apr 5, 2016

Ciao @thomsna I am not sure to have understood your point.

By my side, I think COLLISION_MAX_DELAY should not be shorter than the average oscillation in code execution (temperature, ceramic resonator, different board layout i.e. vias length to clock) but should be as short as possible to reduce bandwidth and computation time loss.

@gioblu

This comment has been minimized.

Owner

gioblu commented Apr 5, 2016

so you let me stumble in this issue: delayMicroseconds(random(0, COLLISION_MAX_DELAY)), minimum should not be zero but this value we should still determine/estimate to go over oscillation.

gioblu added a commit that referenced this issue Apr 7, 2016

@gioblu

This comment has been minimized.

Owner

gioblu commented Apr 7, 2016

Ciao @thomsna today I met with Mauro personally and we have built a network sharing our hardware:

  • 15 Arduino Nano
  • 1 Arduino mega
  • 1 ESP8266 and
  • 1 ATtiny85

With a true multimaster setup (so with bidirectional flux of packets to many devices) and with the new 2.0 version works flawlessly (O_ o we are shocked works really amazingly)!!

The added delays reduced enormously the chance of a collision. We still are considering to setup a test to measure the effective quantity of collisions (if they still occur), probably using a sketch and scoping the channel in parallel with an oscilloscope.

Soon will follow video / gif of the experiment :)

gioblu added a commit that referenced this issue Apr 25, 2016

@gioblu gioblu closed this Apr 25, 2016

@gioblu gioblu removed the TODO label Apr 25, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment