important how to add a coq proj to a splits file json files automatically? i.e coq-proj -> coq-proj_splits.append #11

brando90 · 2022-12-14T23:43:32Z

Goal: coq-proj -> coq-proj_splits.append

download (.gitmodules) the specific version of the coq proj we want -- with respect to the right coq ver & ocaml compiler
- likely need switch, compiler, coq ver, git commit for that coq proj
- if we could use the opam remote version, best! But we need to know how to coqc compile it through pycoq's strace
create splits train, test (random 0.9, 0.1 -- ideally based on topological sort; actually, divide by proj: some projs are train, others test. Simpler + tests harder generalization. In production just train with EVERYTHING)
detect how to build the coq proj (or add missing files), either
- a. opam install: opam.lf files & _CoqProject (the latter just has args to coqc & *.v$ files)
- b. make: use the official MakeFile (see debug_proj) with a _CoqProject (just coqc args, *.v$) and use the make clean command cd-ed to the right dir. Might need the opam envs that proverbot hard codes with echo "eval \"$(opam env --set-switch --switch=$SWITCH)\"" >> coq-projects/$project/make.sh
- c. Dune. No idea how to do this.
then, given these coq-projs, run pycoq's python data extraction script to get the data we need/want
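A minimal sketch of option b's make.sh step, written in Python rather than with echo. The helper names `render_make_sh` and `write_make_sh` are hypothetical, and unlike the echo line above this version evaluates `opam env` when make.sh runs rather than when it is written; treat it as one possible approach, not what proverbot does.

```python
import shlex
from pathlib import Path


def render_make_sh(switch: str, build_command: str) -> str:
    """Return the text of a make.sh that activates an opam switch, then builds.

    Assumption: the build command is a plain shell line such as "make clean".
    """
    lines = [
        "#!/usr/bin/env bash",
        "set -e",  # stop at the first failing command
        # Same opam invocation as in the notes, evaluated at run time.
        f'eval "$(opam env --set-switch --switch={shlex.quote(switch)})"',
        build_command,
    ]
    return "\n".join(lines) + "\n"


def write_make_sh(project_dir: str, switch: str, build_command: str) -> Path:
    """Write make.sh into project_dir and mark it executable."""
    path = Path(project_dir) / "make.sh"
    path.write_text(render_make_sh(switch, build_command))
    path.chmod(0o755)
    return path
```

Keeping the rendering separate from the file write makes it possible to inspect the script text before touching any project directory.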


brando90 · 2022-12-14T23:47:58Z

for commit: #12

brando90 · 2022-12-14T23:49:04Z

In addition, there is no _CoqMake; there is a Make. Might need to port some of these.

brando90 · 2022-12-15T00:15:33Z

to be safe cd to the coq projs (as I did in my python scripts)

git clone the target coq-proj (hoping to replace this step: can one make a data set without downloading the source files e.g. just downloading from a specific version of the coq-proj from opam install? UCSD-PL/proverbot9001#69). If it's already installable with opam, make clean (or dune) & it works with pycoq's strace to compile/build it, then use it & add it to the coq-projs list, else:
if not installable then detect if inside of it there is:
- if coq_proj.lf -> likely nothing to do -- test with strace (likely the case outlined above)
- if _CoqMake file present in coq proj -> use debug_proj MakeFile -> build_command is: opam switch set SWITCH and then make clean (what make.sh from proverbot9001 does anyway)
- if Make -> _CoqMake -> use debug_proj MakeFile -> build_command is: make clean
- if remake file present in coq proj -> do ./remake
- if ./configure* file -> do the build inside of make.sh or put this in the build/install command in coq_proj.opam (want to remove this: when configure x86_64-linux needed? UCSD-PL/proverbot9001#67)
- if Dune -> NOP/Error for now (dune installs #13, https://coq.discourse.group/t/a-guide-to-building-your-coq-libraries-and-plugins-with-dune/20)
- if coq_proj.opam -> idk, need to check what happens here. Do the deps get installed automatically to the activated switch? I hope so; otherwise we would have to run a pre-install script for the deps, as done here https://github.com/brando90/proverbot9001/blob/develop/pycoq_install_coqgym_deps.sh
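The detection order above can be sketched as a small function. This is only an illustration under assumptions: the function name `detect_build` and the returned labels are hypothetical, the check for `dune-project` is my guess at how to recognise a Dune project, and the build commands are copied from the list rather than verified against any project.

```python
from pathlib import Path
from typing import Optional, Tuple


def detect_build(project_dir: str) -> Tuple[str, Optional[str]]:
    """Guess how to build a coq project from the files at its top level.

    Returns (kind, build_command); build_command is None when the notes
    do not settle on a command for that case.
    """
    names = {p.name for p in Path(project_dir).iterdir()}
    if "_CoqMake" in names:
        return "coqmake", "make clean"  # via the debug_proj MakeFile
    if "Make" in names:
        return "make", "make clean"  # after porting Make to _CoqMake
    if "remake" in names:
        return "remake", "./remake"
    if any(n.startswith("configure") for n in names):
        return "configure", None  # build goes inside make.sh instead
    if "dune-project" in names or "dune" in names:
        raise NotImplementedError("Dune builds are not handled yet")
    if any(n.endswith(".opam") for n in names):
        return "opam", None  # still need to check what opam does here
    return "unknown", None
```

Raising on Dune mirrors the "NOP/Error for now" entry, so a new project of that kind fails loudly instead of being silently skipped.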

Goal is to make a single coq_proj.opam for all coq-projects from proverbot. Run the proverbot install script and store everything you need into the coq_proj.opam file (and gitmodule? can we get rid of this? UCSD-PL/proverbot9001#68).

coq ver
repo commit (so version of opam coq-proj)
ocaml compiler (& switch e.g. opam switch create coq-8.14.1 4.07.1)
switch (name e.g. coq-8.14.1)
build (& install?) command for coq-proj (make clean hopefully); remove the configure step (when configure x86_64-linux needed? UCSD-PL/proverbot9001#67, see point above)
depends for coq-proj
it has the official MakeFile that my debug_proj has (based on what are, to my knowledge, the most official instructions to make a coq-proj https://coq.discourse.group/t/official-place-to-learn-how-to-setup-coq-make-files-for-beginner/1682). How to make coq_proj.opam: https://discuss.ocaml.org/t/how-does-one-build-a-coq-proj-opam-file-automatically-from-a-successfully-compile-local-version/10962.
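As a rough sketch of how the fields listed above could be gathered and written out as a coq_proj.opam file: the class `CoqProjMeta`, the function `render_opam`, and the extra field names `x-commit-hash` and `x-opam-switch` are all assumptions for illustration (opam accepts extra fields as long as they start with `x-`). Pinning exact versions with `=` is also just one choice.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def _q(s: str) -> str:
    """Quote a string the way opam expects string literals."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


@dataclass
class CoqProjMeta:
    coq_version: str
    ocaml_compiler: str
    switch: str
    commit: str
    build: List[str]  # build command as an argument list
    depends: Dict[str, Optional[str]] = field(default_factory=dict)


def render_opam(meta: CoqProjMeta) -> str:
    """Render the metadata as opam file text (not validated with opam lint)."""
    deps = {"ocaml": meta.ocaml_compiler, "coq": meta.coq_version, **meta.depends}
    dep_lines = [
        f"  {_q(name)}" + (f" {{= {_q(ver)}}}" if ver else "")
        for name, ver in deps.items()
    ]
    lines = [
        'opam-version: "2.0"',
        "build: [" + " ".join(_q(arg) for arg in meta.build) + "]",
        "depends: [",
        *dep_lines,
        "]",
        f"x-commit-hash: {_q(meta.commit)}",
        f"x-opam-switch: {_q(meta.switch)}",
    ]
    return "\n".join(lines) + "\n"
```

For example, `render_opam(CoqProjMeta("8.14.1", "4.07.1", "coq-8.14.1", "abc123", ["make", "-j%{jobs}%"]))` yields a file with a `build` line, a `depends` block for ocaml and coq, and the two extra fields.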

then we can "remake" all of coq-gym from proverbot and test the addition of a new coq-proj with the above code that checks the things inside that coq-proj, create a copy/git clones it etc according to the list at the top. End result is a _CoqMake with a coq_proj.lf with all dependencies. Figure out how to remove the configure command "build_command": "./configure.sh && ./remake" in the build from coq-projs in proverbot UCSD-PL/proverbot9001#67.

Note: this makes most of the fields in proverbot's splits.json unnecessary -- except for train_files & test_files, which is all that is needed in those files now. Alternatively, we could mark in the coq_proj.opam file whether it's a train or test project, in which case no .json files are needed at all. As in my comment above, I chose to split by project because 1. it's simpler to build and 2. it tests a harder generalization setting. You can just re-train on EVERYTHING once you deploy it in practice, or fine-tune (and add the tokens too) on the test scripts so that it knows those files.

brando90 · 2022-12-15T00:32:00Z

splits:

todo: should we split by project or by train/test files? Splitting by project means we don't need to worry about topological sort.
    It tests a harder generalization setting. Let's do this + it's simpler.
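A small sketch of what splitting by project could look like, assuming a 0.9/0.1 ratio over projects and a fixed seed for reproducibility. The function names and the `project_name` key are assumptions; `train_files` and `test_files` follow the field names mentioned for splits.json.

```python
import random
from typing import Dict, Iterable, List, Sequence, Tuple


def split_by_project(
    projects: Sequence[str], test_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Assign whole projects to train or test; returns (train, test)."""
    names = sorted(projects)  # sort first so the result only depends on seed
    random.Random(seed).shuffle(names)
    n_test = max(1, round(len(names) * test_fraction)) if len(names) > 1 else 0
    return names[n_test:], names[:n_test]


def to_split_entries(
    files_by_project: Dict[str, List[str]], test_projects: Iterable[str]
) -> List[dict]:
    """Build one entry per project; all of a project's files go to one side."""
    test_set = set(test_projects)
    entries = []
    for name, files in sorted(files_by_project.items()):
        is_test = name in test_set
        entries.append(
            {
                "project_name": name,
                "train_files": [] if is_test else sorted(files),
                "test_files": sorted(files) if is_test else [],
            }
        )
    return entries
```

Because no file of a test project is ever seen in training, there is no need to order files by dependency within a project, which is the simplification argued for above.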

note: what is the difference between build & install in an opam file? https://discuss.ocaml.org/t/what-is-the-difference-between-a-build-command-and-an-install-command-in-an-proj-opam-file/10966

Can opam files have arbitrary stuff? https://discuss.ocaml.org/t/can-i-have-arbitrary-text-fields-in-a-proj-opam-file-can-it-be-converted-to-json-too-if-i-want/10967

brando90 · 2022-12-15T20:46:54Z

Edit to plan:

principles/heuristics:
- 1. pre-prepare all the coq projects: the build files needed, a coq_proj.opam file with all the info needed, and, if I am really forced & can't put it in the .opam file, a modules file with the commit (https://stackoverflow.com/questions/5542910/how-do-i-commit-changes-in-a-git-submodule)
  - once the coq-projs are done, the builds work & one can create data from them, push EVERYTHING including the coq-projs submodules and .opam files (note: build files are already there, and the command, commit and dependencies are stored in the opam file; everything is tested with the python mega install). This script also prepares the .json file with only the split info (which is optional since I am thinking of splitting by proj, since it's simpler + might have better gen + ML models benefit more from data scale than from random hardcoded decisions humans make anyway)
- 1. although stuff likely has to be done manually for each coq-project, we can create a function that takes in a specific coq-proj (not coq-projs nor benchmarks) & tries to the best of its ability to do step 1

Summary of API

So the main functions (note benchmark == coq_projs):

1. def prepare_single_coq_proj_generate_dummy_data(path2coqproj: str) -> does all of 1 but for 1 coq_proj automatically, best effort. If it fails, do the fixes manually and try to add them to the function to the best of your ability so it works in the future, with either a filenames != [] or a try. Must try to execute coq-files too & extract dummy data e.g. tactic pred/PosEval.
1. def preprepare_everything_from_scratch(path2benchmark: str) -> does step 1 outlined above
- a. do all the prep to generate valid opam files -> tested with a very lightweight data creation loop (only executes the script, collects tactics, throws them away) -- just to make sure the prep skill works
- b. [optionally calls your real data gen function if you want (passes a function and its args)]
1. def generate_benchmark_data_assuming_preparation_is_done(path2benchmark: str) -> assumes the .opam files and builds are created properly and generates data
1. [Extra?] def add_repo_given_git_url(giturl: str) -> does most of step 0 but adds to gitmodules (if that ends up mattering) and to the path to coq-projs etc.
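One way the outer loop of this API might be wired together, as a sketch only. The function name comes from the list above, but the extra parameters `prepare_single` and `data_gen`, the returned dictionary, and the choice to skip data generation when any project fails are all assumptions added here for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def preprepare_everything_from_scratch(
    path2benchmark: str,
    prepare_single: Callable[[str], None],
    data_gen: Optional[Callable[[str], None]] = None,
) -> Dict[str, Optional[str]]:
    """Run the per-project preparation on every project directory.

    Returns a map from project name to an error message, or None on success,
    so that failing projects can be fixed by hand afterwards.
    """
    results: Dict[str, Optional[str]] = {}
    for proj in sorted(Path(path2benchmark).iterdir()):
        if not proj.is_dir():
            continue  # ignore stray files next to the project directories
        try:
            prepare_single(str(proj))
            results[proj.name] = None
        except Exception as exc:  # best effort: record the failure, keep going
            results[proj.name] = f"{type(exc).__name__}: {exc}"
    # Optionally call the real data generation once every project is ready.
    if data_gen is not None and all(err is None for err in results.values()):
        data_gen(path2benchmark)
    return results
```

Passing `prepare_single` in as an argument keeps the loop testable with a stub before the real per-project preparation exists.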

Vocabulary:

benchmark:= the entire data set. In this context the collection of all the coq-projects so the coq-projects dir inside of pycoq.

Files to manage

Goal is to have everything in 1 place if possible. For now I think we can get away with 3 files (hope is 1 file eventually & push to my repo all the time so everything is reproducible in a good state):

coq_proj.opam. Generated by step 0 (& thus 1).
a. has everything: commit, build (build, install?), deps, opam ver, path to _CoqMake/Make/Remake/Makefile, git home page, switch name, path2coqprojs, anything else? (looks good!)
.gitmodules (git submodules). Generated by step 0 (& thus 1).
*splits.json. contains train, test splits, created at the end of step 1. (I think this one is optional if we are splitting by coq-proj)

brando90 · 2022-12-15T20:58:56Z

For the sake of an example, here is VP's lf.opam file:

opam-version: "2.0"
maintainer: "vasily.pestun@gmail.com"

homepage: "https://github.com/pestun/lf"
dev-repo: "git+https://github.com/pestun/lf.git"
bug-reports: "https://github.com/pestun/lf/issues"
license: "LGPL-2.1"

synopsis: "Software foundation exercises"
description: """
solutions to software foundations exercises 
"""

version: "dev"

build: [make "-j%{jobs}%"]
install: [make "install"]

depends: [
  "ocaml"
  "coq" {(>= "8.11" & < "8.12~") | (= "dev")}
]

tags: [
  "category:Miscellaneous/Coq Extensions"
  "keyword:integer numbers"
  "keyword:arithmetic"
]
authors: [
  "Benjamin C. Pierce"
]

opam install calls build then install: https://discuss.ocaml.org/t/what-is-the-difference-between-a-build-command-and-an-install-command-in-an-proj-opam-file/10966

extra fields in .opam file: https://opam.ocaml.org/doc/Manual.html#opamfield-extra-fields, https://discuss.ocaml.org/t/can-i-have-arbitrary-text-fields-in-a-proj-opam-file-can-it-be-converted-to-json-too-if-i-want/10967/2

.opam -> .json: idk how yet https://discuss.ocaml.org/t/can-i-have-arbitrary-text-fields-in-a-proj-opam-file-can-it-be-converted-to-json-too-if-i-want/10967/2

brando90 · 2022-12-15T20:59:58Z

(extra: start off from the opam file if it already exists; try not to do extra work if the coq_proj already "works", i.e. it installs and I can get data from it)

brando90 · 2022-12-16T02:36:43Z

what about making .opam files automatically for coq?

brando90 · 2023-02-02T23:55:01Z

ruby: https://stackoverflow.com/questions/75330125/why-would-only-using-rbenv-and-ruby-build-work-to-install-ruby-on-ubuntu

brando90 mentioned this issue Dec 15, 2022

is it possible to get ride of the .modules git file and have everything in a coq_proj.opam file? UCSD-PL/proverbot9001#68

Closed

brando90 added the enhancement New feature or request label Dec 15, 2022

brando90 mentioned this issue Dec 15, 2022

figure out a good flow to include more coq-projects into coq-projects dir (as automatically as possible) #15

Open
