Erlang Binary functions#374
Conversation
…ck/hologram into atavistock/create_erlang_binary
|
Hey @atavistock! Thanks for jumping on this so quickly - really appreciate how thorough the GitHub issues: I created one per function. Please comment anything on each so I can assign you officially:
If you think the shared matching algorithms need their own issue, feel free to create one - though ideally due to how Hologram's compiler works, it's better to inline the code with the functions themselves (see below). Why split: I'm swamped with PRs. Separate PRs let me review faster, unblock functions independently, and make it clear to others which functions are already claimed (per the initiative flow). Code organization reminder: The compiler only sees code that's reachable via Deps definitions in the call graph (see the Contributing Guide). So instead of extracting validation or shared logic into private helpers:
Quick note: The CI formatting check found some issues. You can fix them by running Thanks again - your contributions are awesome, we just have to keep the process tight so review doesn't bottleneck. Once the issues are claimed and PRs split, I can move fast on reviews :) |
|
I've just added "Quality Checks Before Submitting" section here: https://hologram.page/reference/contributing, though to get the updated checks you'll need to sync with the |
|
I'm happy to split these apart into separate incremental PRs. I can certainly understand how it will be much easier to digest whats happening and the changes being made. I would like to understand this a little better:
Erlang uses two fairly complex algorithms done in a C .nif for the pattern handling which are decided based on the input. Each of those algorithms does searching, matching, and replacing differently, but they also share a lot of the same boiler plate. Implementing both of them in javascript takes 600 or 900 lines of javascript depending on if we're using an abstract class. Inlining the code into the mock Erlang calls would make for a massive amount of duplication in each of the binary operations. I don't know the implications for Hologram, but I think that amount of duplication would be a significant problem for maintenance and complexity. If we eschew the simpler algorithm completely, it could make inlining search, match, and replace much more clear, but they'd still share a lot of duplicated code for the alogirthm implementation itself. Curious how you feel we should go here? |
|
Hi Aaron, I get it - the shared code is complex and avoiding duplication makes total sense. I think the best approach is to implement them as "virtual" Erlang functions with underscored names (e.g.,
You could even model class instance-like structures this way if needed. What do you think about this approach? Also, can the matchers be split into 2-3 separate issues (beyond the actual |
|
Okay, so I've actually been struggling with one use case of
`:binary.compile_pattern` and I'm not sure that it's solvable.
In erlang `:binary.compile_pattern` can accept a single binary, a list of
binaries, or a pre-compiled pattern. In javascript land we can accept a
binary or a list of binaries, and recreate an implementation which works
within javascript. We can even work with a pre-compiled pattern that was
created in javascript land.
Unfortunately, on the erlang side if you pass a binary or a list of
binaries in erlang you get back a tuple which looks like `{:bm,
#Reference<0.920665636.557187080.162275>}`. This is fine internally in
erlang, but the reference is completely opaque and not reproducible outside
the c coded nif. There's a specially optimized format which has some
specific magic in that reference, and there's just no way I can find to
extract the original pattern or recreate the pattern in javascript.
I've made `test "returns a tuple for a compiled pattern"` in the file
`test/elixir/hologram/ex_js_consistency/erlang/binary_test.exs` to check
for the error case until there's a clear direction.
…On Fri, Nov 21, 2025 at 10:19 AM Bart Blast ***@***.***> wrote:
*bartblast* left a comment (bartblast/hologram#374)
<#374 (comment)>
Hi Aaron, I get it - the shared code is complex and avoiding duplication
makes total sense.
I think the best approach is to implement them as "virtual" Erlang
functions with underscored names (e.g., :binary._my_fun/1). This way they:
- Stay in binary.mjs with Start/End markers
- Can be declared in Deps and tracked in the call graph
- Can have isolated tests
- Get extracted by the compiler only when needed
- Won't be confused with actual Erlang API functions (underscore
prefix convention)
You could even model class instance-like structures this way if needed.
What do you think about this approach? Also, can the matchers be split
into 2-3 separate issues (beyond the actual :binary function PRs) to make
review more manageable, or do they need to go together in one PR?
—
Reply to this email directly, view it on GitHub
<#374 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAFOBOGQF3IZQ2KN4BBHBL355JSZAVCNFSM6AAAAACMQRD7P6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTKNRUGE2DONRSGQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
|
Thanks, I proposed a solution here: #418 (comment) |
|
Closing this PR as the features discussed here have been implemented in the following separate PRs:
Thanks to @atavistock and @mward-sudo for their work on these features! |
#355
The scope of the issue was for
:binary.split/2and:binary.split/3:binary.compile_pattern/1and implementing the existing matching algorithms, which was the bulk of the actual work to get the initial scope finished.:binary.match/2,:binary.match/3,:binary.matches/2,:binary.matches/3,:binary.replace/3, and:binary.replace/4became relatively easy to implement as they also rely on those algoithms for the heavy lifting, and they are consequently included in this pull request.