New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
julia: expand nucleotide-count notes #1769
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great explanation :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow, that's much more detailed than I expected. Great work!
```nasm | ||
; Where: | ||
; r14 is the address of the start of the utf8 array | ||
; rbx is the address of the start of the count array | ||
; rcx is the length of the utf8 array | ||
|
||
; Check utf8 isn't empty | ||
cmp rcx rcx | ||
je endofloop | ||
; init rdx to 0 | ||
xor rdx rdx | ||
countloop: | ||
; Get a byte from the utf8 array | ||
movzx edi, byte ptr [r14 + rdx + 8] | ||
; Increment the value in the count array at offset (byte * 8) | ||
; (* 8 because our array is of 64 bit ints = 8 bytes) | ||
inc qword ptr [rbx + 8*rdi] | ||
; Increment our index into the utf8 array | ||
inc rdx | ||
; If we haven't reached the end of the array yet, go again | ||
cmp rcx, rdx | ||
jne countloop | ||
endofloop: | ||
; ... | ||
``` | ||
|
||
For our code above, Julia produces pretty similar results: | ||
|
||
(I used `code_native(count_nucleotides4, (String,); syntax=:intel)` which you could read as "Show me the native code produced when compiling `count_nucleotides4` with the argument types `(String,)` (i.e. one argument that is a `String`), and show me it in Intel's assembly syntax". You could also use `@code_native count_nucleotides4("foo")` and you'll get the same stuff but in the AT&T assembly format.) | ||
|
||
```nasm | ||
L320: | ||
movzx edx, byte ptr [r14 + rcx] | ||
; Load the address of counts (I don't know why this happens | ||
; repeatedly, but whatever). | ||
mov rdi, qword ptr [rax] | ||
inc qword ptr [rdi + 8*rdx] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bergjohan If you have some time, I'd appreciate it if you could consider this question briefly? Maybe there's an obvious answer that I am ignorant of!
Do you know if there's a reason for this bit mov rdi, qword ptr [rax]
? I think that's loading the address of rax
into rdi
, but I don't understand why it would be done inside a loop where it doesn't look like rax
is getting modified?
The complete ASM for that loop:
L320:
movzx edx, byte ptr [r14 + rcx]
; Load the address of counts (I don't know why this happens
; repeatedly, but whatever).
mov rdi, qword ptr [rax]
inc qword ptr [rdi + 8*rdx]
lea rdx, [rcx + 1]
add rcx, -7
cmp rcx, qword ptr [r14]
mov rcx, rdx
jne L320
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, cool feature, being able to output assembly like this!
Yeah, I don't see any reason why we wouldn't be able to move the mov rdi, qword ptr [rax]
line outside of the loop, instead of executing it in every iteration.
It seems like rax
is a pointer to the counts
array, so we're loading 8 bytes from the address that rax
points to.
In C this would look like: https://godbolt.org/z/3344zv
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's a nice feature! Thanks for the second opinion <3
No description provided.