Advent of Numba

Solutions to Advent of Code 2020 using Numba. Some notes on the solutions:

  • Numba 0.52 is required to run the solutions.
    • For later days I may use features that only appear in the master branch.
  • I will try to use CUDA for each solution.
  • Most solutions will be whatever was easiest for me to write.
    • This might mean a lot of brute force, since the capabilities of a GPU make brute force cheap and brute-force solutions take little effort to invent.
    • The solutions will probably not be close to performance-optimal.
  • I will try to demonstrate something "interesting" about the CUDA target for each solution (a minimal sketch of the kind of thing I mean follows this list). E.g.:
    • Atomic operations
    • Cooperative grids
    • etc.
  • I will try to annotate each solution to explain to a beginner the rationale behind the implementation.
  • I will probably fall a few days behind.
  • I am not optimistic about finishing all 24 days.
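
For example, here is a minimal, hypothetical sketch of the first item above - atomic operations. It is not taken from any particular day's puzzle and all names in it are illustrative: a histogram kernel where many threads may update the same bin, so the updates must be atomic to stay correct under concurrent access.

```python
# Hypothetical example, not from any day's solution: a histogram kernel in
# which many threads may hit the same bin, so the update uses
# cuda.atomic.add to avoid races.
import numpy as np
from numba import cuda


@cuda.jit
def histogram(values, counts):
    i = cuda.grid(1)            # absolute index of this thread in the grid
    if i < values.shape[0]:
        cuda.atomic.add(counts, values[i], 1)


values = np.random.randint(0, 8, size=1024).astype(np.int32)
counts = np.zeros(8, dtype=np.int32)

# 4 blocks of 256 threads - one thread per input element. Numba copies the
# NumPy arrays to the device and copies counts back after the kernel runs.
histogram[4, 256](values, counts)

assert counts.sum() == 1024
```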

Please direct comments / questions / criticisms / veneration to: @gmarkall.

Solutions

Links to the solutions, and some interesting features of each:

  • Day 1: 2D / 3D grids, atomic exchange for stores.
  • Day 2: Atomic increment, structured arrays.
  • Day 3: Building reduction kernels with @cuda.reduce, and explicit host-to-device transfers to avoid unnecessary copying.
  • Day 4: I didn't finish doing this on the GPU.
  • Day 5: Cooperative Groups (grid group / grid sync) and device functions.
  • Day 6: Python solution only so far. I will need to revisit it to complete a CUDA implementation - it should be doable, but I'm under time constraints.
  • Day 7: Python solution only so far. Will probably not do a CUDA implementation of this one as it doesn't easily map to a GPU.
  • Day 8: Sharing the core computation between the CPU and GPU targets by calling an @njit function from a @cuda.jit kernel (see the sketch after this list).
  • Day 9: Demonstrates some changes and workarounds needed when porting pure Python code to the CUDA target, e.g. involving lists, array slicing, and array functions.
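
As a hedged illustration of the day 8 pattern above (a sketch only - the names and the toy computation are mine, not the actual day 8 code): an @njit function holds the shared logic, a plain loop runs it on the CPU, and a @cuda.jit kernel calls the same function on the GPU.

```python
# Sketch of sharing an @njit function between the CPU and CUDA targets.
# The computation is a toy placeholder, not the day 8 logic.
import numpy as np
from numba import cuda, njit


@njit
def step(x):
    # The shared core computation.
    return 2 * x + 1


@njit
def run_cpu(values, out):
    for i in range(values.shape[0]):
        out[i] = step(values[i])


@cuda.jit
def run_gpu(values, out):
    i = cuda.grid(1)
    if i < values.shape[0]:
        # Calling the same @njit function from device code.
        out[i] = step(values[i])


values = np.arange(64, dtype=np.int64)
out_cpu = np.zeros_like(values)
out_gpu = np.zeros_like(values)

run_cpu(values, out_cpu)
run_gpu[1, 64](values, out_gpu)   # one block of 64 threads covers the input

np.testing.assert_array_equal(out_cpu, out_gpu)
```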

Other approaches

Notes

I'm using this section to collect thoughts on improving the usability and accessibility of Numba and the CUDA target that occur to me whilst working on the solutions.

Nice-to-haves:

  • The ability to call atomic inc without specifying a maximum, e.g. cuda.atomic.inc(arr, idx) (day 2).
  • The ability to return things from kernels (every day).
    • Kernel launches are asynchronous, so this could return a future.
    • Alternatively, allow an optional blocking launch to directly return the result. (The current workaround is sketched after this list.)
  • A library of small sort functions (day 4).
    • E.g. a function for a block to cooperatively sort a small array,
    • A whole-grid sort for larger arrays,
    • etc.
  • Better string op support (day 2).
    • E.g. allow passing strings or arrays of bytes to kernels.
    • Lots of lowering for string operations is missing in CUDA (but is probably present in nopython mode).
  • Support for a better print, for "printf debugging" (all days).
    • There is a printf-like function somewhere (in libdevice?) that can format strings, which could be used for this.
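
For context on the "return things from kernels" wish: the usual workaround today is to write the result into a small device array and copy it back, which also waits for the asynchronous launch to finish. A sketch (with made-up names, not tied to any specific day's code):

```python
# Sketch of the current workaround for kernels having no return value:
# the result is accumulated into a one-element array passed to the kernel.
import numpy as np
from numba import cuda


@cuda.jit
def count_evens(values, result):
    i = cuda.grid(1)
    if i < values.shape[0] and values[i] % 2 == 0:
        cuda.atomic.add(result, 0, 1)


values = cuda.to_device(np.arange(1000, dtype=np.int32))
result = cuda.to_device(np.zeros(1, dtype=np.int32))

count_evens[4, 256](values, result)   # the launch is asynchronous

# copy_to_host() synchronizes with the kernel before returning the data -
# this is the step that a future or a blocking launch could fold away.
print(result.copy_to_host()[0])       # 500
```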
