Merge pull request #6 from splitgraph/feature/registry_rls

Stage 1 for the registry RLS
splitgraph · Oct 25, 2018 · 716428b
2 parents 309bdd7 + e0bd7bc
Showing 57 changed files with 1,253 additions and 1,167 deletions.
diff --git a/benchmarking/commit_chain_test.py b/benchmarking/commit_chain_test.py
@@ -56,7 +56,6 @@ def bench_commit_chain_checkout(commits, table_size, update_size):
         update_size = 1000
         commits = 100
 
-
         unmount(conn, MOUNTPOINT)
         init(conn, MOUNTPOINT)
         print("START")

diff --git a/docs/commands.rst b/docs/commands.rst
@@ -17,7 +17,7 @@ Managing images
 
 `sgr checkout`
     checks out a given commit into the schema, first deleting any uncommitted changes. Then,
-    every table in the given Splitgraph image is materialized (copied into the mountpoint as an actual table).
+    every table in the given Splitgraph image is materialized (copied into the repository as an actual table).
 
     As a part of this process, extra physical objects that are required to materialize the image can be downloaded.
 
@@ -31,25 +31,25 @@ There are various (commandline and API) commands that can be used to inspect the
 :mod:`splitgraph.meta_handler` contains more low-level commands that fetch data directly from the metadata
 tables without processing it.
 
-`sgr show MOUNTPOINT IMAGE_HASH`
+`sgr show REPOSITORY IMAGE_HASH`
     Outputs the information about a given image. The verbose mode (`-v`) also lists all the actual objects
     the image depends on.
 
-`sgr diff MOUNTPOINT IMAGE_HASH_1 [IMAGE_HASH_2]`
+`sgr diff REPOSITORY IMAGE_HASH_1 [IMAGE_HASH_2]`
     Also see: :mod:`splitgraph.commands.diff`
 
-    Shows the difference between two images in a mountpoint. If the two images are on the same path in `snap_tree`, it
+    Shows the difference between two images in a repository. If the two images are on the same path in `images`, it
     concatenates their DIFFs and displays that (or the aggregation of total inserts/deletes/updates).
     Note this might give wrong results if there's been a schema change.
 
     If the images are on different branches, it temporarily materializes both revisions and compares them row-by-row.
 
-`sgr log MOUNTPOINT`
+`sgr log REPOSITORY`
     Also see: :func:`splitgraph.commands.misc.get_log`
 
-    Returns the log of changes to a given mountpoint, starting from the current HEAD revision and crawling down.
+    Returns the log of changes to a given repository, starting from the current HEAD revision and crawling down.
     If `--tree` (`-t`) is passed, outputs the full image tree of the schema.
-    Otherwise, and if nothing in the mountpoint is checked out, raises an error.
+    Otherwise, and if nothing in the repository is checked out, raises an error.
 
 `sgr status`
     Lists the currently mounted schemata and their checked out images (if any).
@@ -64,7 +64,7 @@ Also see :mod:`splitgraph.commands.push_pull`
     a full connection string.
 
 `sgr clone`
-    Brings the metadata for the local mountpoint up to date with a remote one, optionally downloading the actual
+    Brings the metadata for the local repository up to date with a remote one, optionally downloading the actual
     physical objects.
 
 `sgr push`
@@ -77,8 +77,8 @@ Also see :mod:`splitgraph.commands.push_pull`
 Importing tables across repositories
 ====================================
 
-`sgr import SOURCE_MOUNTPOINT SOURCE_TABLE TARGET_MOUNTPOINT [TARGET_TABLE] [SOURCE_IMAGE_OR_TAG]`
-    Grafts one or more tables from one mountpoint into another, creating a new single commit on top of the current HEAD.
+`sgr import SOURCE_REPOSITORY SOURCE_TABLE TARGET_REPOSITORY [TARGET_TABLE] [SOURCE_IMAGE_OR_TAG]`
+    Grafts one or more tables from one repository into another, creating a new single commit on top of the current HEAD.
     This doesn't explicitly preserve the imported tables' history. If the new table(s) isn't/aren't materialized, this
     doesn't consume extra space apart from the new entries in the metadata tables. It also doesn't discard any pending
     changes.
@@ -95,15 +95,15 @@ See also :mod:`splitgraph.commands.mounting`.
 
 `sgr mount`
     Uses the Postgres FDW to mount a foreign Postgres/Mongo database as a set of tables into a temporary location
-    and then imports those tables into the target mountpoint as a new Splitgraph image.
+    and then imports those tables into the target repository as a new Splitgraph image.
 
 `sgr unmount`
     Destroys the local copy of a repository and all the metadata related to it in
-    `snap_tree`, `tables`, `remotes` and `snap_tags`. This command doesn't delete the actual physical objects in
+    `images`, `tables`, `remotes` and `snap_tags`. This command doesn't delete the actual physical objects in
     `splitgraph_meta` or references to them in
-    `object_tree` / `object_locations`. There's a separate function, `sgr cleanup`
+    `objects` / `object_locations`. There's a separate function, `sgr cleanup`
     (or :func:`splitgraph.commands.misc.cleanup_objects`) that crawls the `splitgraph_meta` for objects not required
-    by a current mountpoint and does that.
+    by a current repository and does that.
 
 `sgr init`
     Creates an empty repository with one single initial commit (hash `000000...`).
@@ -130,15 +130,15 @@ aren't publicly accessible.
 Provenance tracking allows Splitgraph to recreate the SGFile the image was made with, as well as rebase the image to
 use a different version of the datasets it was made from.
 
-`sgr provenance MOUNTPOINT IMAGE_OR_TAG`
+`sgr provenance REPOSITORY IMAGE_OR_TAG`
     Inspects the image's parents and outputs a list of datasets and their versions
     that were used to create this image (via `IMPORT` or `FROM` commands). If the `-f (--full)` flag is passed, then the
     command will try to reconstruct the full sgfile used to create the image, raising an error if there's a break in the
     provenance chain (e.g. the `MOUNT` command or a SQL query outside of the sgfile interpreter was used somewhere
     in the history of the image). If the `-e` flag is passed, the command will instead stop at the first break in the chain
     and base the resulting sgfile before the break (using the `FROM` command).
 
-`sgr rerun MOUNTPOINT IMAGE_OR_TAG -i DATASET1 IMAGE_OR_TAG1 -i ...`
+`sgr rerun REPOSITORY IMAGE_OR_TAG -i DATASET1 IMAGE_OR_TAG1 -i ...`
     Recreates the SGFile used to derive a given image
     and reruns it, replacing its dependencies as specified by the `-i` options. If the `-u` flag is passed, the image
     is rederived based on the `latest` tag of all its dependencies.
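
Note on `sgr log REPOSITORY` as documented above: the walk from HEAD down through parent images can be pictured with a small sketch. This is not the code in `splitgraph.commands.misc.get_log`; the in-memory `IMAGES` dict and the function name `get_log_sketch` are hypothetical stand-ins for the `images` metadata table and its lookup.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the `images` table of one repository:
# image hash -> parent image hash (None for the initial commit).
IMAGES: Dict[str, Optional[str]] = {
    "000000": None,
    "a1b2c3": "000000",
    "d4e5f6": "a1b2c3",
}


def get_log_sketch(images: Dict[str, Optional[str]], head: Optional[str]) -> List[str]:
    """Return image hashes from HEAD down to the initial commit."""
    if head is None:
        # Mirrors the documented behaviour: nothing checked out is an error.
        raise ValueError("Nothing is checked out in this repository")
    log = []
    current = head
    while current is not None:
        log.append(current)
        current = images[current]  # step to the parent image
    return log


print(get_log_sketch(IMAGES, "d4e5f6"))
```

The sketch only follows a single parent pointer per image, which matches the "crawling down" wording in the docs; the `--tree` mode would need the full set of images instead.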

diff --git a/docs/internals.rst b/docs/internals.rst
@@ -14,19 +14,19 @@ version and tag information, relationships between images and downloaded tables.
 
 Here's an overview of the tables in this schema:
 
-  * `snap_tree`: should really be called `image_tree`. Describes all image hashes and their parents, as well as extra
+  * `images`: Describes all image hashes and their parents, as well as extra
     data about a given commit (the creation timestamp, the commit message and the details of the sgfile command that
-    generated this image). PKd on the mountpoint and the image hash, so the same image can exist in multiple schemas
+    generated this image). PKd on the repository and the image hash, so the same image can exist in multiple schemas
     at the same time.
   * `tables`: an image consists of multiple tables. Each table in a given version is represented by one or more objects.
     An object can be one of two types: SNAP (a snapshot, a full copy of the table) and a DIFF (list of changes to a parent
-    object). This is also mountpoint-specific.
-  * `object_tree`: Lists the type and the parent of every object. A SNAP object doesn't have a parent and a DIFF object
+    object). This is also repository-specific.
+  * `objects`: Lists the type and the parent of every object. A SNAP object doesn't have a parent and a DIFF object
     might have multiple parents (for example, the SNAP and the DIFF of a previous commit). This is not necessarily
     the object linked to the parent commit of a given object: if we're importing a table from a different repository,
     we would pull in its chain of DIFF objects without tying them to commits those objects were created in.
   * `remotes`: Currently, stores the connection string for the upstream repository a given repository was cloned from.
-  * `snap_tags`: maps images and their mountpoints to one or more tags. Tags (apart from HEAD) are pushed and pulled
+  * `snap_tags`: maps images and their repositories to one or more tags. Tags (apart from HEAD) are pushed and pulled
     to/from upstream repositories and are immutable (this is weakly enforced by the push/pull code).
     HEAD is a special tag: it points out to the currently checked-out local image.
   * `object_locations`: If a given object is not stored in the remote, this table specifies where to find it (protocol
@@ -56,17 +56,17 @@ Implementation of various Splitgraph commands
     * If there is an update in the audit log that changes the RI (user suspended constraint checking or the tuple had no
       PK and was updated), the update is changed into an insert + delete.
     * All changes are conflated using a straightforward algorithm in `splitgraph.objects.utils.conflate_changes`.
-  * The meta tables this touches are `object_tree` (to register the new objects and link them to their parents),
-    `tables` (to link tables in the new commit to existing/new objects), `snap_tree` (to register the new commit) and
+  * The meta tables this touches are `objects` (to register the new objects and link them to their parents),
+    `tables` (to link tables in the new commit to existing/new objects), `images` (to register the new commit) and
     `snap_tags` (to move the HEAD pointer to the new commit).
 
 `checkout`
 ----------
 
   * The `tables` table is inspected to find out which object is required to start materializing the table.
-  * Then, `object_tree` is crawled to find a chain of DIFF objects that ends with a SNAP
+  * Then, `objects` is crawled to find a chain of DIFF objects that ends with a SNAP
     (`splitgraph.pg_replication.get_closest_parent_snap_object`).
-  * The SNAP is copied into the mountpoint and the DIFFs applied to it. Checkouts/repository clones are
+  * The SNAP is copied into the schema and the DIFFs applied to it. Checkouts/repository clones are
     lazy by default, so an object might not even exist locally. The lookup path for a physical object is:
 
       * Search locally in the `splitgraph_meta` schema for a cached/predownloaded object.
@@ -83,9 +83,9 @@ Implementation of various Splitgraph commands
 `sgr clone` is implemented as follows:
 
   * First, it connects to the remote and inspects its `splitgraph_meta` table to gather the commits, tags and objects
-    (`snap_tree`, `snap_tags`, `object_tree`, `tables` and `object_locations`) that don't exist in the local
+    (`images`, `snap_tags`, `objects`, `tables` and `object_locations`) that don't exist in the local
     `splitgraph_meta`. See `splitgraph.commands.push_pull._get_required_snaps_objects`.
-  * As part of that, also crawl the remote `object_tree` to gather the list of all required objects
+  * As part of that, also crawl the remote `objects` to gather the list of all required objects
     and their dependencies.
   * Optionally, download the new objects and store them in `splitgraph_meta`.
   * Finally, write the new metadata locally. Currently, this command doesn't check for clashes or conflicts, instead
@@ -115,7 +115,7 @@ tags, objects and their locations) on the remote.
 `import`
 ---------
 
-  * Add the new commit into `snap_tree`
+  * Add the new commit into `images`
   * Copy the required rows from `tables` linking the required objects to the new commit (both the tables in the
     current HEAD and the newly imported tables).
   * Change the HEAD pointer to point to the new commit and optionally materialize the new tables (which might involve
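
Note on the `checkout` steps documented above (crawl `objects` until a SNAP is found, then apply the DIFFs on top of it): a minimal sketch of that lookup, assuming a simplified in-memory model. The `OBJECTS` dict, the object IDs and the function `resolve_chain` are hypothetical and are not the implementation of `get_closest_parent_snap_object`; the rule of preferring a SNAP parent and otherwise following the first parent is also an assumption made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the `objects` table:
# object id -> (format, list of parent object ids).
OBJECTS: Dict[str, Tuple[str, List[str]]] = {
    "snap_1": ("SNAP", []),
    "diff_1": ("DIFF", ["snap_1"]),
    "diff_2": ("DIFF", ["diff_1"]),
}


def resolve_chain(objects: Dict[str, Tuple[str, List[str]]],
                  object_id: str) -> Tuple[str, List[str]]:
    """Return (snap_id, diffs) where diffs are ordered oldest first,
    i.e. in the order they would be applied on top of the SNAP."""
    diffs: List[str] = []
    current = object_id
    while True:
        fmt, parents = objects[current]
        if fmt == "SNAP":
            return current, list(reversed(diffs))
        diffs.append(current)
        # Assumption: if any parent is a SNAP, use it as the base.
        snap_parents = [p for p in parents if objects[p][0] == "SNAP"]
        if snap_parents:
            return snap_parents[0], list(reversed(diffs))
        if not parents:
            raise ValueError("DIFF object %s has no parent" % current)
        # Assumption: otherwise keep following the first parent.
        current = parents[0]


print(resolve_chain(OBJECTS, "diff_2"))
```

Materializing the table would then mean copying the returned SNAP into the schema and replaying the returned DIFFs in order.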

diff --git a/docs/sgfile.rst b/docs/sgfile.rst
@@ -21,10 +21,10 @@ The following commands are supported by the interpreter:
 Basing an image on another image
 --------------------------------
 
-`FROM mountpoint[:tag] [AS alias]`
+`FROM repository[:tag] [AS alias]`
     Bases the output of the sgfile on a certain revision of the remote/local repository.
     If `AS alias` is specified, the repository is cloned into `alias` and the current contents of `alias` destroyed.
-    Otherwise, the current output mountpoint (passed to the executor) is used.
+    Otherwise, the current output repository (passed to the executor) is used.
 
 `FROM` can also be used to perform Docker-like multistage builds.
 
@@ -39,31 +39,31 @@ For example::
 Importing tables from another image
 -----------------------------------
 
-`FROM (mountpoint[:tag])/(MOUNT handler conn_string handler_options) IMPORT table1/{query1} [AS table1_alias], [table2/{query2}...]`
-    Uses the `sgr import` command to import one or more tables from either a local mountpoint, a remote one, or an
+`FROM (repository[:tag])/(MOUNT handler conn_string handler_options) IMPORT table1/{query1} [AS table1_alias], [table2/{query2}...]`
+    Uses the `sgr import` command to import one or more tables from either a local repository, a remote one, or an
     FDW-mounted database.
 
 Optionally, the table name can be replaced with a SELECT query in curly braces that will get executed against the
-source mountpoint in order to create a table. This will be stored as a snapshot. For example:
+source repository in order to create a table. This will be stored as a snapshot. For example:
 
 `FROM internal_data:latest IMPORT {SELECT name, age FROM staff WHERE is_restricted = FALSE} AS visible_staff`
     Will create a new table that contains non-restricted staff names and ages in `internal_data.staff` without including
     any other entries in the table history.
 
     In the case of imports from FDW, the commit hash produced by this command is random. Otherwise, the commit hash will be
-    a combination of the current `OUTPUT` hash, the hash of the source mountpoint and the hashes of the names
+    a combination of the current `OUTPUT` hash, the hash of the source repository and the hashes of the names
     (or source SQL queries) and aliases of all imported tables.
 
 This is crude, but means that the layer is invalidated if there's a change on the remote or we import a different
 table/name it differently/use a different query to create a table.  We can improve on this by perhaps only considering
 the objects and table aliases that are actually imported (as opposed to the source image hash: maybe the tables
-we're importing haven't changed even if other parts of the mountpoint have).
+we're importing haven't changed even if other parts of the repository have).
 
 
 Repository lookups
 ------------------
 
-Currently, a repository name (mountpoint) is converted to a connection string as follows:
+Currently, a repository name is converted to a connection string as follows:
 
   * See if it exists locally (in the case of the sgfile executor). If it does, try to pull it (to update) and
     use it for `FROM`/`IMPORT` commands.
@@ -78,7 +78,7 @@ Running SQL statements
 
 `SQL command`
     Runs a (potentially arbitrary) SQL statement. Doesn't enforce any constraints on the SQL yet,
-    but the spirit of this command is performing actions on tables in the current `OUTPUT` mountpoint (the command is
+    but the spirit of this command is performing actions on tables in the current `OUTPUT` repository (the command is
     executed with the `OUTPUT` schema being the default one) and not changing/reading data from any other schemas.
 
 The image hash produced by this command is a combination of the current `OUTPUT` hash and the hash of the
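
Note on the `IMPORT` hashing rule in docs/sgfile.rst above (the commit hash combines the current `OUTPUT` hash, the hash of the source repository and the hashes of the table names or queries and their aliases): the docs do not say how these are combined, so the sketch below assumes SHA-256 over a simple concatenation. The function names and the sample hash values are hypothetical.

```python
import hashlib
from typing import List, Tuple


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def import_layer_hash(output_hash: str, source_hash: str,
                      tables: List[Tuple[str, str]]) -> str:
    """Combine the inputs named in the docs into one layer hash.

    `tables` is a list of (table name or SQL query, alias) pairs.
    Assumption: concatenate the hex digests and hash the result.
    """
    parts = [output_hash, source_hash]
    for name_or_query, alias in tables:
        parts.append(_sha256(name_or_query))
        parts.append(_sha256(alias))
    return _sha256("".join(parts))


h1 = import_layer_hash("aaaa", "bbbb", [("staff", "visible_staff")])
h2 = import_layer_hash("aaaa", "bbbb", [("staff", "all_staff")])
print(h1 != h2)  # a different alias yields a different layer hash
```

Under this scheme, any change to the source image, the query text or an alias changes the result, which is the invalidation behaviour the docs describe as crude but sufficient.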

diff --git a/setup.py b/setup.py
@@ -20,7 +20,7 @@
 ]
 
 setup(
-    name="splitgraph-prototype",
+    name="splitgraph",
     version="0.0",
     packages=['splitgraph'],
     entry_points={