Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

julia: expand nucleotide-count notes #1769

Merged
merged 4 commits into from Oct 1, 2020
Merged

julia: expand nucleotide-count notes #1769

merged 4 commits into from Oct 1, 2020

Conversation

cmcaine
Copy link
Contributor

@cmcaine cmcaine commented Oct 1, 2020

No description provided.

@cmcaine cmcaine requested a review from a team October 1, 2020 08:55
SaschaMann
SaschaMann previously approved these changes Oct 1, 2020
Copy link
Contributor

@SaschaMann SaschaMann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great explanation :)

tracks/julia/exercises/nucleotide-count/mentoring.md Outdated Show resolved Hide resolved
SaschaMann
SaschaMann previously approved these changes Oct 1, 2020
Copy link
Contributor

@SaschaMann SaschaMann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, that's much more detailed than I expected. Great work!

Comment on lines +106 to +142
```nasm
; Where:
; r14 is the address of the start of the utf8 array
; rbx is the address of the start of the count array
; rcx is the length of the utf8 array

; Check utf8 isn't empty
cmp rcx rcx
je endofloop
; init rdx to 0
xor rdx rdx
countloop:
; Get a byte from the utf8 array
movzx edi, byte ptr [r14 + rdx + 8]
; Increment the value in the count array at offset (byte * 8)
; (* 8 because our array is of 64 bit ints = 8 bytes)
inc qword ptr [rbx + 8*rdi]
; Increment our index into the utf8 array
inc rdx
; If we haven't reached the end of the array yet, go again
cmp rcx, rdx
jne countloop
endofloop:
; ...
```

For our code above, Julia produces pretty similar results:

(I used `code_native(count_nucleotides4, (String,); syntax=:intel)` which you could read as "Show me the native code produced when compiling `count_nucleotides4` with the argument types `(String,)` (i.e. one argument that is a `String`), and show me it in Intel's assembly syntax". You could also use `@code_native count_nucleotides4("foo")` and you'll get the same stuff but in the AT&T assembly format.)

```nasm
L320:
movzx edx, byte ptr [r14 + rcx]
; Load the address of counts (I don't know why this happens
; repeatedly, but whatever).
mov rdi, qword ptr [rax]
inc qword ptr [rdi + 8*rdx]
Copy link
Contributor Author

@cmcaine cmcaine Oct 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bergjohan If you have some time, I'd appreciate it if you could consider this question briefly? Maybe there's an obvious answer that I am ignorant of!

Do you know if there's a reason for this bit mov rdi, qword ptr [rax]? I think that's loading the address of rax into rdi, but I don't understand why it would be done inside a loop where it doesn't look like rax is getting modified?

The complete ASM for that loop:

L320:
        movzx   edx, byte ptr [r14 + rcx]
        ; Load the address of counts (I don't know why this happens
        ; repeatedly, but whatever).
        mov     rdi, qword ptr [rax]
        inc     qword ptr [rdi + 8*rdx]
        lea     rdx, [rcx + 1]
        add     rcx, -7
        cmp     rcx, qword ptr [r14]
        mov     rcx, rdx
        jne     L320

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, cool feature, being able to output assembly like this!

Yeah, I don't see any reason why we wouldn't be able to move the mov rdi, qword ptr [rax] line outside of the loop, instead of executing it in every iteration.

It seems like rax is a pointer to the counts array, so we're loading 8 bytes from the address that rax points to.
In C this would look like: https://godbolt.org/z/3344zv

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a nice feature! Thanks for the second opinion <3

@cmcaine cmcaine merged commit e348fcc into master Oct 1, 2020
@cmcaine cmcaine deleted the jl-nucs branch October 1, 2020 23:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants