Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimal native support #3762

Closed

Conversation

bsansouci
Copy link
Contributor

@bsansouci bsansouci commented Aug 18, 2019

Hey @bobzhang !
I realized that it'd probably better to make a PR with a minimal support of native compilation, rather than go down the rabbit hole of making Belt work, mostly because a minimal native support is already fun and useful.
It might look scary, but most changes are hidden under #if BS_NATIVE then. The only changes that aren't hidden under that flag are done to help make the diff more readable, and don't cause a semantic change. For example I pass the backend arg to all functions, which defaults to Js, but it's not used by any function when compiled with BS_NATIVE=false.
For native compilation, I have to nest the build artifacts under lib/bs/js / lib/bs/bytecode / lib/bs/native to make sure the files didn't step on each other (like the cmi which aren't the same between JS and native/bytecode). That nesting is also compiled away for bsb, of course, but I do pass nested (the directory name) to the functions that need them.

Here is a list of things that PR does.

  • Add functions to access the ocaml stdlib vendored in bucklescript (Bsb_build_util)
  • Add some custom exceptions for native compilation.
  • Add -bs-binary-simple-ast flag to bsc, which generates a simple AST file (which ocamlc / ocamlopt understands)
  • Fix the .merlin file generation
  • Adds BS_NATIVE_PPX to differentiate between the part of the ppx used
    inside bsc and the standalone ppx used by bsb-native for compiling
    belt to native.
  • Make Bsb_ninja_check also check if the -backend argument changed to know to rebuild or not
  • Fixed old merged BS_NATIVE stuff (including bsb_helper)
  • Pass backend to all the functions, but compile out the -backend flag
  • Add a whole new file called bsb_ninja_native which handles outputting the ninja file for building, linking and packing. Super similar to Bsb_ninja_file_groups.
  • Modified Bsb_ninja_gen to add the global variables we need for native compilation and call the new Bsb_ninja_native.handle_file_groups function.
  • Make Bsb_world collect a list of the dependencies in topological order (since we're already iterating over them).
  • Simplify bsb_helper_depfile_gen
  • Remove -bs-super-errors from linker/packer for now (that requires changing the compiler to accept that flag and enable support error)
  • Make Ext_bytes.escape also escape $, I think because we might pass arguments that contain those characters.
  • Update scripts/ninja.js to take a -native flag to generate the ninja files with BS_NATIVE=true. This allows me to build the whole thing running ninja.js config -native && ninja.js build. Making a release will require a couple more changes but those can be done in a subsequent PR.

@bobzhang
Copy link
Member

hi @bassjacob thanks for keeping working on this, I will do the review and get back to you asap

Copy link
Member

@bobzhang bobzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I made some suggestions, the general idea is that you can make use lazy evaluation to reduce the changes as much as possible (so you don't need pass more arguments), it is also more efficient

jscomp/bsb/bsb_build_util.ml Outdated Show resolved Hide resolved
let json = Ext_json_parse.parse_json_from_file Literals.bsconfig_json in
match json with
| Obj {map} -> extract_main_entries map
| _ -> assert false

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The json file is parsed several time, shall we add a lazy field to avoid parsing it several times like

let json_lazy = lazy (Ext_json_parse.parse_json_... )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea! But it seems like it currently depends on cwd, do you think we can drop that dependency?
So instead of Ext_json_parse.parse_json_from_file (cwd // Literals.bsconfig_json) we would do Ext_json_parse.parse_json_from_file Literals.bsconfig_json.

@@ -86,18 +98,27 @@ let output_merlin_namespace buffer ns=
Buffer.add_string buffer "-open ";
Buffer.add_string buffer x

#if BS_NATIVE then
let bsc_flg_to_merlin_ocamlc_flg bsc_flags backend =
#else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could backend be made a lazy value so that the patch does not need be intrusive (the argument does not need be passed across each function), so is nest argument

let backend = lazy (...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that it should be in some module that I can access like Lazy.force Bsb_args.backend, or something like that, so that everywhere that needs backend would read it from there?
One issue I see is that it's more than just a value because it depends on reading the bsconfig, so it depends on cwd. Similar to my previous comment above, do you think we should just get rid of passing cwd down or even just use Sys.getcwd () instead?

| Bsb_config_types.Native -> Literals.native
| Bsb_config_types.Bytecode -> Literals.bytecode
in
#end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same thing, we can push it into lazy evaluation since it only needs to be evaluated once during the lifetime of such process

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that really an optimization that is worth it? I didn't think that the matching would be worse than calling Lazy.force in a bunch of places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that it doesn't take any args and it still has a side effect, do you think I should use lazy? It might make sense, I'm just scared of having the issue of calling this too early, because the backend is parsed and be stuck with that wrong value.

@@ -28,6 +28,7 @@ type t =
dir_or_files : string array ;
st_mtimes : float array;
source_directory : string ;
cmdline_backend: Bsb_config_types.compilation_kind_t;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we don't need add a new field, we can just change the name of file, for js backend its name is xx, for native backend its name is yy so they don't conflict with each other

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking to reduce such conflicts even for ninja file name, they can pick different names so less conflict is introduced, then you can call ninja -f xx.ninja

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting! I can append _js, _bytecode and _native to the file name, no problem.
I don't think we need to do the ninja file name because I have to put the build artifacts in different folders anyway!

@@ -170,16 +188,19 @@ let make_custom_rules
);
if has_ppx then
Buffer.add_string buf " $ppx_flags";
if gen_simple then
Buffer.add_string buf " -bs-simple-binary-ast";
Buffer.add_string buf " $bsc_flags -o $out -bs-syntax-only -bs-binary-ast $in";
Buffer.contents buf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you generate .d file in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a bit what you mean?
The .d file handling's done in bsb_helper_depfile_gen. I don't see what this change relates to .d file generation.

(fun acc v -> match String_map.find_opt module_to_filepath v with
| Some file -> (file ^ suffix_object_files) :: acc
| None -> failwith @@ "build.ninja is missing the file '" ^ v ^ "' that was used in the project. Try force-regenerating but this shouldn't happen."
)
[]
sorted_tasks in
let warning_command = if String.length warnings > 0 then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to help me review it easier, the changes to bsb_native alone could be sent separately and merged quickly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to bsb_helper?

#if BS_NATIVE then
| '$' ->
Bytes.unsafe_set s' !n '$'; incr n; Bytes.unsafe_set s' !n '$'
#end
| '\n' ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this function used in bs_native? The escaping rules does not seem correct to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I meant to escape : I think. Because ninja doesn't like : and bsb_native generates some windows path that are absolute so C:\something and ninja dies on those.

What's not correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you read the docs of {String.ecaped}, it escape the special character following ocaml lexical convention which is not relevant to windows path convention, I guess you need a special function for windows path escaping

@@ -1588,7 +1589,7 @@ function nativeNinja() {
var templateNative = `
subninja ${getPreprocessorFileName()}
rule optc
command = $ocamlopt -safe-string -I +compiler-libs -opaque ${includes} -g -w +6-40-30-23 -warn-error +a-40-30-23 -absname -c $in
command = BS_NATIVE=${BS_NATIVE} $ocamlopt -safe-string -I +compiler-libs -opaque ${includes} -g -w +6-40-30-23 -warn-error +a-40-30-23 -absname -c $in
description = $out : $in
rule archive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this env variable read?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is read by the native compiler! It uses env vars for the #if BS_NATIVE macros.

Copy link
Contributor Author

@bsansouci bsansouci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed some fixes and added a bunch of questions!

jscomp/bsb/bsb_build_util.ml Outdated Show resolved Hide resolved
let json = Ext_json_parse.parse_json_from_file Literals.bsconfig_json in
match json with
| Obj {map} -> extract_main_entries map
| _ -> assert false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea! But it seems like it currently depends on cwd, do you think we can drop that dependency?
So instead of Ext_json_parse.parse_json_from_file (cwd // Literals.bsconfig_json) we would do Ext_json_parse.parse_json_from_file Literals.bsconfig_json.

@@ -86,18 +98,27 @@ let output_merlin_namespace buffer ns=
Buffer.add_string buffer "-open ";
Buffer.add_string buffer x

#if BS_NATIVE then
let bsc_flg_to_merlin_ocamlc_flg bsc_flags backend =
#else
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that it should be in some module that I can access like Lazy.force Bsb_args.backend, or something like that, so that everywhere that needs backend would read it from there?
One issue I see is that it's more than just a value because it depends on reading the bsconfig, so it depends on cwd. Similar to my previous comment above, do you think we should just get rid of passing cwd down or even just use Sys.getcwd () instead?

@@ -28,6 +28,7 @@ type t =
dir_or_files : string array ;
st_mtimes : float array;
source_directory : string ;
cmdline_backend: Bsb_config_types.compilation_kind_t;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting! I can append _js, _bytecode and _native to the file name, no problem.
I don't think we need to do the ninja file name because I have to put the build artifacts in different folders anyway!

@@ -170,16 +188,19 @@ let make_custom_rules
);
if has_ppx then
Buffer.add_string buf " $ppx_flags";
if gen_simple then
Buffer.add_string buf " -bs-simple-binary-ast";
Buffer.add_string buf " $bsc_flags -o $out -bs-syntax-only -bs-binary-ast $in";
Buffer.contents buf
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a bit what you mean?
The .d file handling's done in bsb_helper_depfile_gen. I don't see what this change relates to .d file generation.

(fun acc v -> match String_map.find_opt module_to_filepath v with
| Some file -> (file ^ suffix_object_files) :: acc
| None -> failwith @@ "build.ninja is missing the file '" ^ v ^ "' that was used in the project. Try force-regenerating but this shouldn't happen."
)
[]
sorted_tasks in
let warning_command = if String.length warnings > 0 then
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to bsb_helper?

#if BS_NATIVE then
| '$' ->
Bytes.unsafe_set s' !n '$'; incr n; Bytes.unsafe_set s' !n '$'
#end
| '\n' ->
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I meant to escape : I think. Because ninja doesn't like : and bsb_native generates some windows path that are absolute so C:\something and ninja dies on those.

What's not correct?

@@ -1588,7 +1589,7 @@ function nativeNinja() {
var templateNative = `
subninja ${getPreprocessorFileName()}
rule optc
command = $ocamlopt -safe-string -I +compiler-libs -opaque ${includes} -g -w +6-40-30-23 -warn-error +a-40-30-23 -absname -c $in
command = BS_NATIVE=${BS_NATIVE} $ocamlopt -safe-string -I +compiler-libs -opaque ${includes} -g -w +6-40-30-23 -warn-error +a-40-30-23 -absname -c $in
description = $out : $in
rule archive
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is read by the native compiler! It uses env vars for the #if BS_NATIVE macros.

| Bsb_config_types.Native -> Literals.native
| Bsb_config_types.Bytecode -> Literals.bytecode
in
#end
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that really an optimization that is worth it? I didn't think that the matching would be worse than calling Lazy.force in a bunch of places.

@bsansouci
Copy link
Contributor Author

Friendly ping @bobzhang :)

@bobzhang
Copy link
Member

bobzhang commented Sep 2, 2019

Friendly ping @bobzhang :)

Looking into it

Copy link
Member

@bobzhang bobzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @bsansouci I would like to merge you patch to unblock your work, but the diff is so big that is really hard to do a review.
I made a diff #3798 (hopefully to help reduce your changes)

let get_ocaml_lib_dir ~is_js ~cwd =
if is_js then (Filename.dirname (get_bsc_dir ~cwd)) // "lib" // "ocaml"
else (get_ocaml_dir ~cwd) // "lib" // "ocaml"
#end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see Bsb_global_paths module so that you don't need function here.
Also I see is_js always passed false here, why do we need a parameter for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to be inside Bsb_global_paths! Thanks

| Other s -> s)

let rec check_aux cwd (xs : string array) (ys: float array) i finish =
let rec check_aux cwd ~nested (xs : string array) (ys: float array) i finish =
if i = finish then Good
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's nested? can we make the diff as small as possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, that was very useless. I used your change to make this less intrusive (I think) by ~200 lines :D

@bsansouci bsansouci force-pushed the minimal-native-support branch 2 times, most recently from dc9c324 to 3dd2db1 Compare September 6, 2019 02:34
| None ->
raise (Arg.Bad ("don't know what to do with " ^ fn))
end
#end
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of my cleanup of old broken stuff.

@bsansouci bsansouci force-pushed the minimal-native-support branch 2 times, most recently from fc91112 to eb81d9f Compare September 6, 2019 03:07
@bsansouci
Copy link
Contributor Author

Updated the diff! It's quite a bit smaller thanks for that!
I added Bsb_global_backend which encapsulates what we talked about. Tell me what you think.

bsansouci added a commit to bsansouci/bucklescript that referenced this pull request Oct 2, 2019
And this sets us better for rescript-lang#3762 :)
This shouldn't affect bsb / bsc, everything's compiled away.
@bsansouci bsansouci force-pushed the minimal-native-support branch 2 times, most recently from 01f0b36 to a1fb3e3 Compare October 26, 2019 06:07
@@ -89,6 +89,9 @@ val sort_imports : bool ref
val syntax_only : bool ref
val binary_ast : bool ref

#if BS_NATIVE then
val simple_binary_ast : bool ref
#end

val bs_suffix : bool ref
val debug : bool ref
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not have to be on conditional compilation, we can support it natively in a separate PR

let oc = open_out_bin (outputprefix ^ Literals.suffix_mliast_simple) in
Ml_binary.write_ast Mli !Location.input_name ast oc;
close_out oc ;
end;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

#if BS_NATIVE then
| '$' ->
Bytes.unsafe_set s' !n '$'; incr n; Bytes.unsafe_set s' !n '$'
#end
| '\n' ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you read the docs of {String.ecaped}, it escape the special character following ocaml lexical convention which is not relevant to windows path convention, I guess you need a special function for windows path escaping

| Asttypes.Nolabel -> assert false
| Optional s
| Labelled s -> s
#else
if arg_name.[0] = '?' then
String.sub arg_name 1 (String.length arg_name - 1)
else arg_name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make it a separate PR

Make it build in 4.06!

Fixes to make sure this stuff compiles
@bsansouci
Copy link
Contributor Author

Closing! Working on a simpler approach.

@bsansouci bsansouci closed this Mar 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants