Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Programatic use of Machine / Scryer as library #1880

Merged
merged 73 commits into from Nov 2, 2023

Conversation

lucksus
Copy link
Contributor

@lucksus lucksus commented Jul 12, 2023

These changes make it possible to run Prolog queries on a Machine instance in a 3rd-party crate that pulls-in Scryer as a library dependency, interfacing with the machine only with in-memory Strings.

Overview

  1. Machine::run_input_once(&mut self) runs the added toplevel predicate run_input_once/0 which reads the query from user input like repl but just executes it once, returning/printing all results
  2. Machine::load_module_string(&mut self, module_name: &str, program: String) creates a stream from the given string and loads like a file
  3. Machine::set_user_input(&mut self, input: String) and Machine::get_user_output(&self) -> String enabling interfacing with the machine
  4. Machine::run_query(&mut self, query: String) -> QueryResult wraps it all up by setting the query as input and parsing the output string into added Rust type QueryResult.

Example usage

let mut machine = Machine::new_lib();

machine.load_module_string("facts", String::from(r#"
    triple("a", "p1", "b").
    triple("a", "p2", "b").
"#));

let output: QueryResult = machine.run_query(String::from(r#"triple("a",P,"b")."#));

assert_eq!(output, QueryResult::Matches(vec![
    QueryMatch::from(btreemap!{
        "P" => Value::from("p1"),
    }),
    QueryMatch::from(btreemap!{
        "P" => Value::from("p2"),
    }),
]));

Next steps

  • ensure parsing of complex result values
  • consult/1 like updating of machine state

@triska
Copy link
Contributor

triska commented Jul 12, 2023

Very interesting work and contribution, thank you a lot for working on this!

The only suggestion I have is that it would be awesome, and I think preferable, to generalize the currently existing function run_top_level so that an arbitrary goal can be invoked as the Scryer Prolog toplevel. Currently, '$repl' is hardcoded:

self.run_module_predicate(atom!("$toplevel"), (atom!("$repl"), 1));

Generalizing this function so that an arbitrary goal can be specified could let you avoid the introduction of the new and seemingly rather ad hoc definition of run_input_once in 112d398. Because: What if some other library user would prefer yet another toplevel instead, such as your "run once" with slight variations, or something else entirely?

Personally, I would prefer to not add such additional toplevels to toplevel.pl, but to let every program that needs specialized toplevels to flexibly specify the Prolog code and goal that should act as the toplevel.

@lucksus
Copy link
Contributor Author

lucksus commented Jul 17, 2023

Ok, I've generalized run_toplevel() to take module and key of the predicate to run as parameters. But to enable specifying the whole toplevel file by the user of Machine, I also had to generalize load_top_level() and the constructor. For that I added MachineConfig with default implementation and also reduced copy/paste code in Machine::with_test_streams() to use the new config pattern.

I've also extracted the lib query functions and test into their own file. Also added a new lib_toplevel.pl that is only used there. But I ended up copying over almost all of toplevel.pl. Not sure if this is better than just adding a few predicates to toplevel.pl. But this is definitely more generic and more easily applicable to other similar use-cases.

@lucksus
Copy link
Contributor Author

lucksus commented Jul 17, 2023

BTW, I was basing these changes off of the v0.9.1 commit because the latest head with updated tokio dependency creates problems when used in my project which also pulls in Deno with specific dependencies which work with that old revision of scryer but not the latest head.

@triska
Copy link
Contributor

triska commented Jul 21, 2023

If user-interaction is not needed, a simpler toplevel could suffice. For instance, here is a very rudimentary toplevel that may suffice for your use case:

toplevel :-
        read_term(Goal, [variable_names(VNs)]),
        Goal,
        write('bindings(['),
        write_bindings(VNs),
        write(']).'),
        nl,
        false.
toplevel :- toplevel.

write_bindings([]).
write_bindings([VN|VNs]) :-
        write_bindings_(VNs, VN).

write_bindings_([], VN) :-
        write_binding(VN).
write_bindings_([VN|VNs], Prev) :-
        write_binding(Prev),
        write(','),
        write_bindings_(VNs, VN).

write_binding(Var=Val) :-
        write(Var),
        write(=),
        write_term(Val, [quoted(true),double_quotes(true)]).

Sample interaction:

?- toplevel.
X=3.
bindings([X=3]).
member(X, "abc").
bindings([X=a]).
bindings([X=b]).
bindings([X=c]).

Add halt/0 or emitting no_more_solutions. as needed!

Such a rudimentary toplevel may be provided by the application itself.

@rujialiu
Copy link

rujialiu commented Oct 20, 2023 via email

@triska
Copy link
Contributor

triska commented Oct 20, 2023

Thank you! In any case, the issue appears not to be specific to this PR: It also occurs in #2126 ?

@triska
Copy link
Contributor

triska commented Oct 23, 2023

If I read the CI logs correctly, only the build for the (comparatively dated) 20.04 version of Ubuntu fails, and only for the 32-bit version? If that is the case, then I do not consider the single failing CI build a justification for any delay with merging this PR: As mentioned, #2126 has the same issue, and all other platforms, including the more recent Ubuntu 22.04, seem to build correctly.

Seconding #1880 (comment), I would recommend to first and foremost get the API right, so that (as mentioned above) answers can be accessed by an iterator, because then also infinite sequences of answers can be processed in programs that use this API. (Think of ?- length(Ls, L). etc.).

@Skgland
Copy link
Contributor

Skgland commented Oct 23, 2023

This failure and #2126 appear unrelated to me.

  • Bump rustix from 0.38.14 to 0.38.19 #2126 had a network request fail in the Publish cargo test summary step Request POST /repos/mthom/scryer-prolog/check-runs failed with 403: Forbidden
  • This failed in the Build wasm step as a compiler process got a sigkill (signal: 9, SIGKILL: kill)

Both of these two steps appear to only run for the ubuntu-20.04 x86_64 job, so other jobs can't fail on these steps.

@mthom
Copy link
Owner

mthom commented Oct 25, 2023

SIGKILL usually happens because the OS has run out of memory. Is anyone able to reproduce the issue on their machine after pulling the branch with wasm-pack build --target web -- --no-default-features?

I measured memory usage of rustc on my machine invoked via wasm-pack and at its peak it used 18.4% of system memory, which on my (64 GB) machine, is around 11 - 12 GBs.

@Skgland
Copy link
Contributor

Skgland commented Oct 25, 2023

SIGKILL usually happens because the OS has run out of memory. Is anyone able to reproduce the issue on their machine after pulling the branch with wasm-pack build --target web -- --no-default-features?

I was not able to reproduce it on my Laptop (16GB).

Was on the master branch instead of the PR branch.
With this PR via git fetch upstream pull/1880/head:pr1880 I get a (signal: 15, SIGTERM: termination signal) on my Laptop (16GB).

@infogulch
Copy link
Contributor

Notably, the github actions runners have 7GB of memory:

https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources

@rujialiu
Copy link

I'm not familiar with this part of rust, but is it possible to reduce the number of inlines during compilation? There seems to be a loooot of them.

@infogulch
Copy link
Contributor

fwiw, I tried compiling with -j 1 which compiles serially instead of in parallel when possible, but it didn't change the memory usage (which peaked around 6GB on my system with this methodology), it just took longer to complete.

image

# I ran this in another terminal during compilation. It's not scientific at all, but I can't be bothered to try harder when the results are so clear already.
while true; do smem -t -P '(cargo|rustc) ' | tail -1 | awk '{print $5}' >> log.txt; sleep 0.1s; done

@infogulch
Copy link
Contributor

Ok I made some changes on my fork so the CI gets farther, but there are still some issues which I think are related to platform-specific code that doesn't work in wasm32.

In particular, running cargo test on the wasm target fails:

$ cargo test --target wasm32-unknown-unknown --no-default-features --all --verbose
...
error[E0433]: failed to resolve: use of undeclared crate or module `imp`
  --> /home/joe/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wait-timeout-0.2.0/src/lib.rs:66:9
   |
66 |         imp::wait_timeout(self, dur)
   |         ^^^ use of undeclared crate or module `imp`

My guess is that some tests need to be excluded for the wasm target?

@infogulch
Copy link
Contributor

infogulch commented Oct 26, 2023

Changes to CI in #2137 may have fixed the build issues here. (Note: a new build will be triggered if you push any commit or amend the last commit and force push it.)

While developing that pr I found that tests are failing for the wasm build and it probably needs some conditional build tags on tests. See this issue for details:

Fix failing tests on wasm target #2138

@mthom
Copy link
Owner

mthom commented Oct 26, 2023

I can also re-run it as the repo owner, which I'm doing now.

@Skgland
Copy link
Contributor

Skgland commented Oct 26, 2023

I can also re-run it as the repo owner, which I'm doing now.

Wouln't it need to be rebased on master/merged with master first, as CI is testing the state of the PR not the merged state?

@triska
Copy link
Contributor

triska commented Oct 26, 2023

I think the failing build on ubuntu-20.04 nightly is due to f80dff8 not yet being applied in this branch.

@infogulch
Copy link
Contributor

I can also re-run it as the repo owner, which I'm doing now.

Wouln't it need to be rebased on master/merged with master first, as CI is testing the state of the PR not the merged state?

A push to the branch is handled differently than manually triggering a rebuild. Specifically, if triggered by the pull_request: trigger then the branch is tested after being merged with the target, but other triggers (including manual) will build the branch as-is. See: actions/checkout#15

@mthom
Copy link
Owner

mthom commented Oct 26, 2023

I can also re-run it as the repo owner, which I'm doing now.

Wouln't it need to be rebased on master/merged with master first, as CI is testing the state of the PR not the merged state?

Right, of course. My enthusiasm for merging this PR got in the way of my sense.

@lucksus
Copy link
Contributor Author

lucksus commented Nov 2, 2023

I think the failing build on ubuntu-20.04 nightly is due to f80dff8 not yet being applied in this branch.

I've merge master in, which includes that commit. Still failing...

@Skgland
Copy link
Contributor

Skgland commented Nov 2, 2023

I think the failing build on ubuntu-20.04 nightly is due to f80dff8 not yet being applied in this branch.

I've merge master in, which includes that commit. Still failing...

Different Error though, now in step Build release binary:

error[E0599]: no function or associated item named `new_multi_thread` found for struct `tokio::runtime::Builder` in the current scope
  --> src/bin/scryer-prolog.rs:19:44
   |
19 |     let runtime = tokio::runtime::Builder::new_multi_thread()
   |                                            ^^^^^^^^^^^^^^^^
   |                                            |
   |                                            function or associated item not found in `Builder`
   |                                            help: there is an associated function with a similar name: `new_current_thread`

For more information about this error, try `rustc --explain E0599`.
warning: `scryer-prolog` (bin "scryer-prolog") generated 1 warning
error: could not compile `scryer-prolog` (bin "scryer-prolog") due to previous error; 1 warning emitted

instead of target_os = “wasi”
@mthom
Copy link
Owner

mthom commented Nov 2, 2023

Great! If there are no further comments I'll merge and we'll have a new release out in time for the meetup.

@lucksus
Copy link
Contributor Author

lucksus commented Nov 2, 2023

Great! If there are no further comments I'll merge and we'll have a new release out in time for the meetup.

🎉 🥳 🚀

(I didn't get to changing run_query() to return an iterator, but don't think that should block a merge - can be added in a separate, small PR)

@mthom mthom merged commit 575245c into mthom:master Nov 2, 2023
11 checks passed
@bakaq
Copy link
Contributor

bakaq commented Nov 3, 2023

(I didn't get to changing run_query() to return an iterator, but don't think that should block a merge - can be added in a separate, small PR)

Keep in mind what I said about breaking changes before. If the current version gets on crates.io, which is the next logical step, it will have to follow the Rust variant of semver. Changing the signature of run_query() is a breaking change, so if the published version is 0.9.3, after this change it will have to be 0.10.0 instead of 0.9.4. A way to avoid this is to introduce a different run_query_iter() function instead, which would only require a minor version bump, and can also be used to define run_query() if it seems appropriate. Maybe bumping the major version is not such a big deal, but ideally the versions of the binary and the library are synchronized (it's literally the same codebase), so bumping the major version of the library would mean also bumping the version of the binary, which would probably be expected by users to be a big update instead of some random "internal" interface change.

@infogulch
Copy link
Contributor

infogulch commented Nov 4, 2023

I think this could be a good way to do benchmarking with scryer-prolog. See discussion: #1782 With iai 1, which measures performance using Cachegrind by counting instructions and cache misses, this could be viable to set up in CI.

Thoughts?

Example:

benches/edges.rs:

use iai::{black_box, main};

fn iai_benchmark_edges(n: u64) -> u64 {
    let mut machine = Machine::new_lib();
    
    machine.load_module_string("facts", String::from(r#"
      :- use_module(library(clpb)).
      :- use_module(library(assoc)).
      :- use_module(library(lists)).
      
      /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
         Contiguous United States and DC as they appear in SGB:
         http://www-cs-faculty.stanford.edu/~uno/sgb.html
      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
      
      edge(al, fl).
      edge(al, ga).
      edge(al, ms).
      edge(al, tn).
      edge(ar, la).
      edge(ar, mo).
      edge(ar, ms).
      edge(ar, ok).
      edge(ar, tn).
      edge(ar, tx).
      edge(az, ca).
      edge(az, nm).
      edge(az, nv).
      edge(az, ut).
      edge(ca, nv).
      edge(ca, or).
      edge(co, ks).
      edge(co, ne).
      edge(co, nm).
      edge(co, ok).
      edge(co, ut).
      edge(co, wy).
      edge(ct, ma).
      edge(ct, ny).
      edge(ct, ri).
      edge(dc, md).
      edge(dc, va).
      edge(de, md).
      edge(de, nj).
      edge(de, pa).
      edge(fl, ga).
      edge(ga, nc).
      edge(ga, sc).
      edge(ga, tn).
      edge(ia, il).
      edge(ia, mn).
      edge(ia, mo).
      edge(ia, ne).
      edge(ia, sd).
      edge(ia, wi).
      edge(id, mt).
      edge(id, nv).
      edge(id, or).
      edge(id, ut).
      edge(id, wa).
      edge(id, wy).
      edge(il, in).
      edge(il, ky).
      edge(il, mo).
      edge(il, wi).
      edge(in, ky).
      edge(in, mi).
      edge(in, oh).
      edge(ks, mo).
      edge(ks, ne).
      edge(ks, ok).
      edge(ky, mo).
      edge(ky, oh).
      edge(ky, tn).
      edge(ky, va).
      edge(ky, wv).
      edge(la, ms).
      edge(la, tx).
      edge(ma, nh).
      edge(ma, ny).
      edge(ma, ri).
      edge(ma, vt).
      edge(md, pa).
      edge(md, va).
      edge(md, wv).
      edge(me, nh).
      edge(mi, oh).
      edge(mi, wi).
      edge(mn, nd).
      edge(mn, sd).
      edge(mn, wi).
      edge(mo, ne).
      edge(mo, ok).
      edge(mo, tn).
      edge(ms, tn).
      edge(mt, nd).
      edge(mt, sd).
      edge(mt, wy).
      edge(nc, sc).
      edge(nc, tn).
      edge(nc, va).
      edge(nd, sd).
      edge(ne, sd).
      edge(ne, wy).
      edge(nh, vt).
      edge(nj, ny).
      edge(nj, pa).
      edge(nm, ok).
      edge(nm, tx).
      edge(nv, or).
      edge(nv, ut).
      edge(ny, pa).
      edge(ny, vt).
      edge(oh, pa).
      edge(oh, wv).
      edge(ok, tx).
      edge(or, wa).
      edge(pa, wv).
      edge(sd, wy).
      edge(tn, va).
      edge(ut, wy).
      edge(va, wv).
      
      pairs_keys_values([], [], []).
      pairs_keys_values([A-B|ABs], [A|As], [B|Bs]) :-
              pairs_keys_values(ABs, As, Bs).
      
      independent_set(*(NBs)) :-
              findall(U-V, edge(U, V), Edges),
              setof(U, V^(member(U-V, Edges);member(V-U, Edges)), Nodes),
              pairs_keys_values(Pairs, Nodes, _),
              list_to_assoc(Pairs, Assoc),
              maplist(not_both(Assoc), Edges, NBs).
      
      not_both(Assoc, U-V, ~BU + ~BV) :-
              get_assoc(U, Assoc, BU),
              get_assoc(V, Assoc, BV).
    "#));
    
    let output: QueryResult = machine.run_query(String::from(r#"independent_set(Sat), sat_count(Sat, Count)."#));
    
    assert_eq!(output, QueryResult::Matches(vec![
        QueryMatch::from(btreemap!{
            "Count" => Value::from("217968"),
        }),
    ]));
}

iai::main!(iai_benchmark_edges);

Footnotes

  1. https://bheisler.github.io/criterion.rs/book/iai/iai.html

@alexpetros
Copy link

I think I just discovered this too early before you've had the chance to write it, but as far as I can tell there's no reference to the possibility of using this as an embedded library on the README or the documentation (or this PR). Want to just drop a note of in support of promoting this use-case because it's exactly what I came here looking for, and I think it's a killer feature!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet