Possible Ways to Improve
Take the lock only after store.save_raw has finished, so the slow write happens outside the lock and the blocking is limited to the HAMT update.

Current: get lock -> write 2 MB -> build HAMT -> unlock -> get lock -> write 2 MB -> build HAMT -> unlock.
Improvement: write all 20 MB in parallel -> get lock for a 2 MB chunk -> build HAMT -> unlock. The second chunk is then available immediately, because it no longer has to wait for a 2 MB write to finish while the lock is held.
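
The improved ordering could look roughly like the sketch below. It is only an illustration under assumptions: SketchStore, write_chunk and write_all are hypothetical names, a plain dict stands in for the HAMT, and SHA-256 stands in for whatever ID the real store.save_raw returns. The point is just that the slow write runs without the lock and only the index update is serialized.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


class SketchStore:
    """Hypothetical stand-in for the real store (assumption, not the real API)."""

    def __init__(self):
        self.blocks = {}
        self._lock = threading.Lock()

    def save_raw(self, data: bytes) -> str:
        # In the real system this would be the slow network write.
        block_id = hashlib.sha256(data).hexdigest()
        with self._lock:
            self.blocks[block_id] = data
        return block_id


def write_chunk(store, index, index_lock, key, data):
    # Slow part: runs without holding the index lock, so chunks upload in parallel.
    block_id = store.save_raw(data)
    # Fast part: only the index update (stand-in for the HAMT) is serialized.
    with index_lock:
        index[key] = block_id


def write_all(store, chunks, max_workers=4):
    """Write every chunk in parallel and return the key -> block ID index."""
    index = {}
    index_lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(write_chunk, store, index, index_lock, key, data)
            for key, data in chunks.items()
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker thread
    return index
```

For example, write_all(SketchStore(), {"chunk-0": b"...", "chunk-1": b"..."}) returns a dict with one entry per chunk, and no worker waits on another worker's write.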
Second Improvement Idea:
Use a memory store for the HAMT key-value pairs. This still allows parallel writing to IPFS but limits the blocking, since operations in memory are much quicker. Use the IPFS hashing to hash the data and build the HAMTs properly in memory.
After it's done, just write the contents of the HAMTs to IPFS, and the hashes should align.
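
A minimal sketch of this second idea, under stated assumptions: MemoryStore, RemoteStore, content_id and flush are hypothetical names, and SHA-256 is used as a stand-in for IPFS hashing (real IPFS IDs are CIDs, which also encode the hash function and codec). The hashes only align if the in-memory hashing uses exactly the same settings as the IPFS node that receives the data.

```python
import hashlib


def content_id(data: bytes) -> str:
    # Stand-in for IPFS hashing (assumption): the ID depends only on the bytes.
    return hashlib.sha256(data).hexdigest()


class MemoryStore:
    """Hypothetical in-memory store used while the HAMT is being built."""

    def __init__(self):
        self.blocks = {}

    def save_raw(self, data: bytes) -> str:
        block_id = content_id(data)
        self.blocks[block_id] = data
        return block_id


class RemoteStore:
    """Hypothetical stand-in for IPFS; it hashes content the same way."""

    def __init__(self):
        self.blocks = {}

    def save_raw(self, data: bytes) -> str:
        block_id = content_id(data)
        self.blocks[block_id] = data
        return block_id


def flush(memory: MemoryStore, remote: RemoteStore) -> int:
    """Copy every in-memory block to the remote store and check the IDs match."""
    for block_id, data in memory.blocks.items():
        remote_id = remote.save_raw(data)
        if remote_id != block_id:
            raise ValueError(f"hash mismatch: {block_id} != {remote_id}")
    return len(memory.blocks)
```

Because each ID is derived from the content alone, the HAMT built against MemoryStore already refers to the IDs the remote store will assign, so flush can run once at the end (and could itself be parallelized).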