support arrays of any size, don't require an initialization value, .. #4

japaric · 2017-10-03T14:05:01Z

single producer single consumer support for ring buffer

pftbest · 2017-10-03T14:52:36Z

Sorry for interrupting, but I'm curious: why did you decide to use usize for head and tail instead of atomics? Is it to make it work on armv6, which doesn't have atomics? If I understand correctly, volatile operations are only safe on single-core CPUs. On multi-core CPUs you need some kind of memory fence to share the data in the buffer between cores. The best option is release-acquire synchronization using atomics.
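For illustration, the release-acquire scheme mentioned above could look roughly like the sketch below. This is not the code from this PR or from the linked playground: the `Spsc` type, its fields and its method names are hypothetical, and it uses `std` plus const generics, neither of which this crate relied on at the time. The methods are `unsafe` because the queue is only sound with at most one producer thread and one consumer thread.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical fixed-capacity SPSC queue. It stores at most N - 1 items
/// (N must be at least 1). Items still queued when it is dropped are leaked.
pub struct Spsc<T, const N: usize> {
    head: AtomicUsize, // next slot to read; written only by the consumer
    tail: AtomicUsize, // next slot to write; written only by the producer
    buf: [UnsafeCell<MaybeUninit<T>>; N],
}

// Safety: callers must uphold the contracts on `enqueue` and `dequeue`.
unsafe impl<T: Send, const N: usize> Sync for Spsc<T, N> {}

impl<T, const N: usize> Spsc<T, N> {
    pub fn new() -> Self {
        Spsc {
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
            buf: std::array::from_fn(|_| UnsafeCell::new(MaybeUninit::uninit())),
        }
    }

    /// Safety: must only be called from the single producer.
    pub unsafe fn enqueue(&self, item: T) -> Result<(), T> {
        let tail = self.tail.load(Ordering::Relaxed);
        let next = (tail + 1) % N;
        // Acquire pairs with the consumer's Release store to `head`, so the
        // slot we are about to overwrite has already been read out.
        if next == self.head.load(Ordering::Acquire) {
            return Err(item); // queue is full
        }
        unsafe { (*self.buf[tail].get()).write(item) };
        // Release publishes the slot contents before the new `tail`.
        self.tail.store(next, Ordering::Release);
        Ok(())
    }

    /// Safety: must only be called from the single consumer.
    pub unsafe fn dequeue(&self) -> Option<T> {
        let head = self.head.load(Ordering::Relaxed);
        // Acquire pairs with the producer's Release store to `tail`.
        if head == self.tail.load(Ordering::Acquire) {
            return None; // queue is empty
        }
        let item = unsafe { (*self.buf[head].get()).assume_init_read() };
        // Release hands the slot back to the producer.
        self.head.store((head + 1) % N, Ordering::Release);
        Some(item)
    }
}
```

A real API would more likely split the queue into separate producer and consumer handles so that the single-producer, single-consumer rule is enforced by the type system rather than by an `unsafe` contract.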

pftbest · 2017-10-03T15:34:28Z

@japaric Not so long ago I wrote a similar SPSC ring buffer and tested it using ThreadSanitizer:

https://play.rust-lang.org/?gist=4ebd95ec79757a7d2b625cb85d58300c&version=stable

japaric · 2017-10-31T13:45:32Z

@pftbest yes and yes. I'm also mostly interested in using this crate with single-core processors (microcontrollers), but it's a good idea to note that this won't work on multicore processors. We can revisit the atomic implementation if / when rust-lang/rust#45085 lands; otherwise it's tricky to write an implementation for ARMv6-M.

japaric · 2017-10-31T17:02:36Z

@homunkulus r+

homunkulus · 2017-10-31T17:02:37Z

📌 Commit da5757a has been approved by japaric

homunkulus · 2017-10-31T17:02:50Z

⌛ Testing commit da5757a with merge d7939bb...

support arrays of any size, don't require an initialization value, .. single producer single consumer support for ring buffer

homunkulus · 2017-10-31T17:49:46Z

💔 Test failed - status-travis

japaric · 2017-10-31T17:50:03Z

@homunkulus retry

homunkulus · 2017-10-31T17:50:12Z

⌛ Testing commit da5757a with merge 10c5542...

support arrays of any size, don't require an initialization value, .. single producer single consumer support for ring buffer

homunkulus · 2017-10-31T18:02:15Z

☀️ Test successful - status-travis
Approved by: japaric
Pushing 10c5542 to master...

290: optimize the codegen of Vec::clone r=japaric a=japaric

these changes optimize `Vec<u8, 1024>::clone` down to these operations

1. reserve the stack space (1028 bytes on 32-bit ARM) and leave it uninitialized
2. zero the `len` field
3. memcpy `len` bytes of data from the parent

analyzed source code

``` rust
use heapless::Vec;

fn clone(vec: &Vec<u8, 1024>) {
    let mut vec = vec.clone();
    black_box(&mut vec);
}

fn black_box<T>(val: &mut T) {
    unsafe { asm!("// {0}", in(reg) val) }
}
```

machine code with `lto = fat`, `codegen-units = 1` and `opt-level = 'z'` ('z' instead of 3 to avoid loop unrolling and keep the machine code readable)

``` armasm
00020100 <clone>:
   20100:         b5d0        push    {r4, r6, r7, lr}
   20102:         af02        add     r7, sp, #8
   20104:         f5ad 6d81   sub.w   sp, sp, #1032   ; 0x408
   20108:         2300        movs    r3, #0
   2010a:         c802        ldmia   r0!, {r1}
   2010c:         9301        str     r3, [sp, #4]
   2010e:         aa01        add     r2, sp, #4
   20110: /--/-X  b141        cbz     r1, 20124 <clone+0x24>
   20112: |  |    4413        add     r3, r2
   20114: |  |    f810 4b01   ldrb.w  r4, [r0], #1
   20118: |  |    3901        subs    r1, #1
   2011a: |  |    711c        strb    r4, [r3, #4]
   2011c: |  |    9b01        ldr     r3, [sp, #4]
   2011e: |  |    3301        adds    r3, #1
   20120: |  |    9301        str     r3, [sp, #4]
   20122: |  \--  e7f5        b.n     20110 <clone+0x10>
   20124: \---->  a801        add     r0, sp, #4
   20126:         f50d 6d81   add.w   sp, sp, #1032   ; 0x408
   2012a:         bdd0        pop     {r4, r6, r7, pc}
```

note that it's not optimizing step (3) to an actual `memcpy` because we lack the 'trait specialization' code that libstd uses

---

before `clone` was optimized to

1. reserve and zero (`memclr`) 1028 (!?) bytes of stack space
2. (unnecessarily) runtime check if `len` is equal or less than 1024 (capacity) -- this included a panicking branch
3. memcpy `len` bytes of data from the parent

Co-authored-by: Jorge Aparicio <jorge.aparicio@ferrous-systems.com>
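As a rough illustration of the three steps listed above (uninitialized buffer, zeroed `len`, element-by-element copy), here is a minimal sketch of a `Clone` impl that avoids zeroing the backing storage. `FixedVec` and its methods are hypothetical stand-ins written for this example, not the actual heapless definitions, and the sketch assumes `std` and const generics.

```rust
use std::mem::MaybeUninit;
use std::{ptr, slice};

/// Hypothetical fixed-capacity vector; not the real `heapless::Vec`.
pub struct FixedVec<T, const N: usize> {
    len: usize,
    buf: [MaybeUninit<T>; N],
}

impl<T, const N: usize> FixedVec<T, N> {
    pub fn new() -> Self {
        FixedVec {
            len: 0,
            // Safety: an array of `MaybeUninit` needs no initialization,
            // so nothing is written to the buffer here.
            buf: unsafe { MaybeUninit::<[MaybeUninit<T>; N]>::uninit().assume_init() },
        }
    }

    pub fn push(&mut self, item: T) -> Result<(), T> {
        if self.len == N {
            return Err(item); // no capacity left
        }
        self.buf[self.len].write(item);
        self.len += 1;
        Ok(())
    }

    pub fn as_slice(&self) -> &[T] {
        // Safety: the first `len` slots are always initialized.
        unsafe { slice::from_raw_parts(self.buf.as_ptr() as *const T, self.len) }
    }
}

impl<T: Clone, const N: usize> Clone for FixedVec<T, N> {
    fn clone(&self) -> Self {
        // Steps 1 and 2: uninitialized storage, `len` set to zero.
        let mut new = Self::new();
        // Step 3: copy `len` elements one at a time. `len` is bumped after
        // every write so that a panicking `T::clone` leaves `new` droppable.
        for item in self.as_slice() {
            new.buf[new.len].write(item.clone());
            new.len += 1;
        }
        new
    }
}

impl<T, const N: usize> Drop for FixedVec<T, N> {
    fn drop(&mut self) {
        // Drop only the initialized prefix.
        let live = ptr::slice_from_raw_parts_mut(self.buf.as_mut_ptr() as *mut T, self.len);
        unsafe { ptr::drop_in_place(live) };
    }
}
```

Without specialization, a generic loop like this stays an element-wise copy even for `u8`, which matches the note above about step (3) not becoming a real `memcpy`.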

support arrays of any size, don't require an initialization value, ..

7e91814

single producer single consumer support for ring buffer

japaric mentioned this pull request Oct 3, 2017

Enhancements of Vec and CircularBuffer #3

Closed

japaric added 4 commits October 31, 2017 14:50

don't use indexing to elide bound checks

0968267

update the documentation

e841c8a

add CI

054f291

add examples and cfail tests

da5757a

japaric pushed a commit that referenced this pull request Oct 31, 2017

Auto merge of #4 - japaric:v2, r=japaric

d7939bb

support arrays of any size, don't require an initialization value, .. single producer single consumer support for ring buffer

japaric pushed a commit that referenced this pull request Oct 31, 2017

Auto merge of #4 - japaric:v2, r=japaric

10c5542

support arrays of any size, don't require an initialization value, .. single producer single consumer support for ring buffer

homunkulus merged commit da5757a into master Oct 31, 2017

japaric deleted the v2 branch October 31, 2017 18:02

japaric mentioned this pull request Oct 31, 2017

Indexing / slicing rather than unsafe offset/from_raw_parts? #1

Closed
