Merge branch '1.0.x' into 'master'
remram44 committed Dec 17, 2019
2 parents f4e3b31 + 9402ff7 commit a4b459c
Showing 8 changed files with 26 additions and 3 deletions.
2 changes: 1 addition & 1 deletion .travis/install.sh
@@ -24,7 +24,7 @@ in
sudo apt-get update -qq
sudo apt-get install -qq libc6-dev-i386 gcc-multilib
if [ $TEST_MODE = "coverage" ]; then
-pip install coverage codecov
+pip install 'coverage<5' codecov
pip install -e ./reprozip -e ./reprounzip -e ./reprounzip-docker -e ./reprounzip-vagrant -e ./reprounzip-vistrails -e ./reprounzip-qt -e ./reprozip-jupyter
else
pip install ./reprozip ./reprounzip ./reprounzip-docker ./reprounzip-vagrant ./reprounzip-vistrails ./reprounzip-qt -e ./reprozip-jupyter
13 changes: 13 additions & 0 deletions docs/developerguide.rst
@@ -12,6 +12,19 @@ Continuous testing is provided by `Travis CI <https://travis-ci.org/ViDA-NYU/rep

If you have any questions or need help with the development of an unpacker or plugin, please use our development mailing-list at `dev@reprozip.org <https://vgc.poly.edu/mailman/listinfo/reprozip-users>`__.

Introduction to ReproZip
------------------------

ReproZip works in two steps: tracing and packing. Under the hood, tracing is itself split into two separate steps, which gives the following workflow:

* Running the experiment under trace. While the experiment runs, the ``_pytracer`` C extension watches it through the ``ptrace`` mechanism and records information in the trace SQLite3 database (``.reprozip-trace/trace.sqlite3``). The extension stores raw information as it is recorded and does little else, leaving further processing to the next step. This part is referred to as the "C tracer".
* After the experiment is done, the Python code computes additional information to generate the configuration file, by looking at the trace database and the filesystem. For example, all accesses to a file are aggregated to decide whether it is read or written by the overall experiment and whether it is an input or output file, symlinks are resolved, etc. Additional information is also written, such as OS information and which distribution package each file comes from.
* Packing reads the configuration file to create the ``.rpz`` bundle, which includes a configuration file (rewritten into a "canonical" version), the trace database (though it is not read at this step), and the files listed in the configuration (which the user may have altered).

It is therefore important to note that the configuration file and the trace database contain distinct information: although the configuration is inferred from the database, it also contains additional details that were obtained from the original machine afterwards.

Only the configuration file should be necessary to run unpackers. The trace database is included for information, and to support additional commands like ``reprounzip graph`` (:ref:`graph`).
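
The aggregation performed in the second step can be sketched roughly as follows. This is only an illustration, not ReproZip's actual code: the table is a cut-down stand-in for ``opened_files``, and the ``FILE_READ``/``FILE_WRITE`` bit values and the ``classify_files`` helper are assumptions made for the example.

```python
import sqlite3

# Assumed bit flags for the ``mode`` column (illustrative values only).
FILE_READ = 0x01
FILE_WRITE = 0x02


def classify_files(conn):
    """Combine all accesses to each file name into one read/written summary."""
    summary = {}
    rows = conn.execute("SELECT name, mode FROM opened_files ORDER BY timestamp")
    for name, mode in rows:
        entry = summary.setdefault(name, {"read": False, "written": False})
        entry["read"] = entry["read"] or bool(mode & FILE_READ)
        entry["written"] = entry["written"] or bool(mode & FILE_WRITE)
    return summary


# Simplified stand-in for the trace database, held in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE opened_files("
    "id INTEGER NOT NULL PRIMARY KEY, run_id INTEGER NOT NULL, "
    "name TEXT NOT NULL, timestamp INTEGER NOT NULL, mode INTEGER NOT NULL)"
)
# Made-up accesses: one file is only read, the other is written then read.
conn.executemany(
    "INSERT INTO opened_files(run_id, name, timestamp, mode) VALUES (?, ?, ?, ?)",
    [
        (0, "/data/input.csv", 1, FILE_READ),
        (0, "/data/output.csv", 2, FILE_WRITE),
        (0, "/data/output.csv", 3, FILE_READ),
    ],
)
result = classify_files(conn)
```

The point of the sketch is that the database keeps one row per access, while the configuration file holds one aggregated entry per file.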

Writing Unpackers
-----------------

4 changes: 4 additions & 0 deletions docs/traceschema.rst
@@ -18,8 +18,10 @@ Each entry in the ``processes`` table has the id of its parent, i.e. the process

CREATE TABLE processes(
id INTEGER NOT NULL PRIMARY KEY,
run_id INTEGER NOT NULL,
parent INTEGER,
timestamp INTEGER NOT NULL,
is_thread BOOLEAN NOT NULL,
exitcode INTEGER
);
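
As a rough illustration of how the ``parent`` and ``is_thread`` columns might be used, the sketch below loads the ``processes`` schema shown above into an in-memory database and rebuilds the parent/child relationships for one run. The rows and the ``children_of`` helper are made up for the example and are not part of ReproZip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE processes(
        id INTEGER NOT NULL PRIMARY KEY,
        run_id INTEGER NOT NULL,
        parent INTEGER,
        timestamp INTEGER NOT NULL,
        is_thread BOOLEAN NOT NULL,
        exitcode INTEGER
    )
    """
)
# Made-up rows: process 1 is the root, 3 is a thread of 1, 4 is a child of 2.
conn.executemany(
    "INSERT INTO processes VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 0, None, 100, 0, 0),
        (2, 0, 1, 110, 0, 0),
        (3, 0, 1, 120, 1, None),
        (4, 0, 2, 130, 0, 1),
    ],
)


def children_of(conn, run_id):
    """Map each parent id to the ids of its children within one run."""
    tree = {}
    query = (
        "SELECT id, parent FROM processes "
        "WHERE run_id = ? AND parent IS NOT NULL ORDER BY id"
    )
    for proc_id, parent in conn.execute(query, (run_id,)):
        tree.setdefault(parent, []).append(proc_id)
    return tree


tree = children_of(conn, 0)
threads = [row[0] for row in conn.execute("SELECT id FROM processes WHERE is_thread = 1")]
```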

@@ -34,6 +36,7 @@ Each file has a numerical id, the canonical path name, the process that accessed

CREATE TABLE opened_files(
id INTEGER NOT NULL PRIMARY KEY,
run_id INTEGER NOT NULL,
name TEXT NOT NULL,
timestamp INTEGER NOT NULL,
mode INTEGER NOT NULL,
@@ -59,6 +62,7 @@ This is a variant of ``opened_files`` for file executions, i.e. `execve(2) <http
CREATE TABLE executed_files(
id INTEGER NOT NULL PRIMARY KEY,
name TEXT NOT NULL,
run_id INTEGER NOT NULL,
timestamp INTEGER NOT NULL,
process INTEGER NOT NULL,
argv TEXT NOT NULL,
1 change: 1 addition & 0 deletions reprounzip/reprounzip/unpackers/graph.py
@@ -299,6 +299,7 @@ def read_events(database, all_forks, has_thread_flag, has_exit_timestamp):
# doesn't do anything worth showing on the graph, it will be erased, unless
# all_forks is True (--all-forks).

assert database.is_file()
if PY3:
# On PY3, connect() only accepts unicode
conn = sqlite3.connect(str(database))
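
The commit does not state why the ``assert database.is_file()`` check was added. One plausible reason is that ``sqlite3.connect()`` creates a new, empty database when the path does not exist, so a wrong path would otherwise only fail later with a less obvious error. A small sketch of that behaviour, using ``pathlib.Path`` as a stand-in for whatever path type the project uses:

```python
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    database = Path(tmp) / "trace.sqlite3"
    existed_before = database.is_file()

    # connect() does not fail on a missing file; it creates an empty database.
    conn = sqlite3.connect(str(database))
    try:
        tables = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    exists_after = database.is_file()
```

Without the check, a query against such an empty database would fail with a "no such table" error rather than pointing at the missing file.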
1 change: 1 addition & 0 deletions reprounzip/reprounzip/unpackers/provviewer.py
@@ -49,6 +49,7 @@ def generate(target, configfile, database):

has_thread_flag = config.format_version >= LooseVersion('0.7')

assert database.is_file()
if PY3:
# On PY3, connect() only accepts unicode
conn = sqlite3.connect(str(database))
6 changes: 4 additions & 2 deletions reprozip/native/syscalls.c
@@ -433,7 +433,8 @@ static int record_shebangs(struct Process *process, const char *exec_target)
"Linux will not give the process any "
"privileges from set-uid while it is being "
"traced. This will probably break whatever "
-"you are tracing.");
+"you are tracing. Executable: %s",
+exec_target);
}
else
{
@@ -481,7 +482,8 @@ static int record_shebangs(struct Process *process, const char *exec_target)
"Linux will not give the process any "
"privileges from set-gid while it is being "
"traced. This will probably break whatever "
-"you are tracing.");
+"you are tracing. Executable: %s",
+exec_target);
}
else
{
1 change: 1 addition & 0 deletions reprozip/reprozip/main.py
@@ -62,6 +62,7 @@ def shell_escape(s):
def print_db(database):
"""Prints out database content.
"""
assert database.is_file()
if PY3:
# On PY3, connect() only accepts unicode
conn = sqlite3.connect(str(database))
1 change: 1 addition & 0 deletions reprozip/reprozip/tracer/trace.py
@@ -370,6 +370,7 @@ def write_configuration(directory, sort_packages, find_inputs_outputs,
"""
database = directory / 'trace.sqlite3'

assert database.is_file()
if PY3:
# On PY3, connect() only accepts unicode
conn = sqlite3.connect(str(database))
