Commit

Merge branch 'main' of github.com:Oxen-AI/oxen-release
gschoeni committed Jul 15, 2023
2 parents 40628ee + c9a6edf commit ab2c0d8
Showing 90 changed files with 2,393 additions and 499 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/CI.yml
@@ -53,7 +53,7 @@ jobs:
name: "macOS arm64"
},
{
os: "macos-13-xl", # try xl?
os: "macos-13",
python-architecture: "x64",
rust-target: "x86_64-apple-darwin",
name: "macOS x64"
55 changes: 48 additions & 7 deletions README.md
@@ -97,7 +97,7 @@ Visit [https://www.oxen.ai/register](https://www.oxen.ai/register) to register f

# Basic Commands

Here is a quick overview of common commands translated to Oxen.
Here is a quick overview of common Oxen commands. If you are familiar with git, this should be an easy learning curve.

## Setup User

@@ -107,9 +107,50 @@ For your commit history, you will have to set up your local Oxen user name and e
oxen config --name "YOUR_NAME" --email "YOUR_EMAIL"
```

## Create Local Repository
## Clone a Remote Repository

First, create a new directory, navigate into it, and perform
There are a few ways that you can clone an Oxen repository, depending on the level of data transfer you want to incur. The default `oxen clone` with no flags will download the latest commit from the `main` branch.

```bash
oxen clone https://hub.oxen.ai/ox/CatDogBBox
```

To fetch the latest commit from a specific branch you can use the `-b` flag.

```bash
oxen clone https://hub.oxen.ai/ox/CatDogBBox -b my-pets
```

Downloading all the data may still be a more expensive operation than you need. To download only the minimal metadata required to interact with the remote, use the `--shallow` flag.

```bash
oxen clone https://hub.oxen.ai/ox/CatDogBBox --shallow -b my-pets
```

This is especially handy for appending data via the [remote workspace](https://docs.oxen.ai/en/latest/concepts/remote_workspace.html). After cloning with the `--shallow` flag, you will notice there are no data files in your working directory. You can still see the data on the remote branch with the `oxen remote` subcommands.

```bash
# View the remote files
oxen remote ls
```

You can also use `oxen remote download` to download individual files or directories. This is useful if you only need the testing data and not the full set of training files and directories.

```bash
oxen remote download test.csv
```

Lastly, if you want to clone the entire commit history locally, you can use the `--all` flag. This is handy if you want to pull a full history and push to a new remote, or have a workflow where you need to quickly swap between commits locally. Often for running experiments, training, or testing, all you need is a subset of the data.

```bash
oxen clone https://hub.oxen.ai/ox/CatDogBBox --all
```
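
The clone options above can be summarized in a small dry-run sketch. The `oxen_clone_cmd` function is a hypothetical helper, not part of Oxen: it only prints the `oxen clone` command for a chosen transfer level and never runs it. The mode names (`latest`, `shallow`, `all`) are made up for illustration, and the README does not say whether `--all` can be combined with `-b`, so treat that combination as an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Oxen): print the clone command
# for a given transfer level instead of executing it.
oxen_clone_cmd() {
  local url="$1" mode="${2:-latest}" branch="${3:-}"
  local cmd="oxen clone $url"
  case "$mode" in
    latest)  ;;                       # default: latest commit only
    shallow) cmd="$cmd --shallow" ;;  # minimal metadata, no data files
    all)     cmd="$cmd --all" ;;      # entire commit history
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  # Append the branch flag only when a branch was requested.
  if [ -n "$branch" ]; then
    cmd="$cmd -b $branch"
  fi
  echo "$cmd"
}

# Prints the same shallow clone command shown earlier in this section.
oxen_clone_cmd https://hub.oxen.ai/ox/CatDogBBox shallow my-pets
```

Printing the command rather than running it keeps the sketch safe to try without network access or an Oxen install.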

## Initialize Local Repository

If you do not have a remote dataset, you can initialize one locally.

Similar to git: create a new directory, navigate into it, and perform

```bash
oxen init
@@ -308,11 +349,11 @@ cd $REPO_NAME
oxen pull origin my-branch
```

## Remote Staging
## Remote Workspace

There are times when you do not want to clone the entire repository to make a change. For example, if you have a large dataset and you want to add one annotation, it is very inefficient to clone all the files locally.

Oxen's remote staging area helps enable a more efficient workflow. Simply add the `oxen remote` subcommand to the commands you already know how to use locally.
You can think of Oxen's remote workspace as mirroring your local workspace, but without all the files downloaded. It should feel like you are interacting locally when really all the action is on the server. Simply add the `oxen remote` subcommand to the commands you already know how to use locally.

Let's walk through an example. Start by shallow cloning a repo and checking out a specific branch.

@@ -326,7 +367,7 @@ If you do a quick `ls` you will see that there are no files locally. Never fear,
oxen remote status
```

This checks the remote staging area on this branch to see if you have any remote files staged. You can then proceed to `add` and `commit` changes without ever having to clone the entire dataset.
This checks the remote workspace on this branch to see if you have any remote files staged. You can then proceed to `add` and `commit` changes without ever having to clone the entire dataset.

```bash
oxen remote add image.png
@@ -336,7 +377,7 @@ oxen remote add image.png
oxen remote status
```

For more information about remote staging, refer to the [remote staging documentation](https://github.com/Oxen-AI/oxen-release/blob/main/RemoteStaging.md).
For more information about Oxen's remote workspaces, refer to the [remote workspace documentation](https://docs.oxen.ai/en/latest/concepts/remote_workspace.html).

## Oxen Badges

6 changes: 3 additions & 3 deletions oxen/Cargo.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions oxen/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "oxen"
version = "0.1.19"
version = "0.1.21"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
@@ -15,7 +15,7 @@ log = "0.4.17"
pyo3-log = "0.8.1"
tokio = { version = "1", features = ["full"] }
pyo3-polars = "0.4.1"
liboxen = "0.6.4"
liboxen = "0.7.2"

[build-dependencies]
cc = { version = "1.0", features = ["parallel"] }
Binary file modified oxen/docs/build/doctrees/concepts/data_frames.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/concepts/embedding_search.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/concepts/remote_staging.doctree
Binary file not shown.
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/contributing/documentation.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/contributing/python.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/contributing/rust.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/environment.pickle
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/getting_started/commands.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/getting_started/installation.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/getting_started/python.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/getting_started/tutorials.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/index.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/references/python/local_repo.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/references/python/remote_repo.doctree
Binary file not shown.
Binary file modified oxen/docs/build/doctrees/references/rust.doctree
Binary file not shown.
2 changes: 1 addition & 1 deletion oxen/docs/build/html/.buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: c9ea93225dab131b400fe9803b2be122
config: 6055e06166dfe4ae5a0187593e29aa85
tags: 645f666f9bcd5a90fca523b33c5a78b7
@@ -1,13 +1,13 @@
# Remote Staging Workflow
# Remote Workspaces

Oxen has the concept of a "remote staging area" to enable easy data collection and labeling workflows. There are two main types of data you might want to stage.
Oxen has the concept of a "remote workspace" to enable easy data collection and labeling workflows. There are two main types of data you might want to stage.

1) Unstructured data files (images, videos, audio, text)
2) Structured annotations (rows for tabular data frames)

Instead of cloning the entire dataset locally (which can take a lot of time, bandwidth, and storage) you can stage data directly on the remote server.

The commands you are used to working with in your local workspace (`status`, `add`, `commit`, etc.) now work with the remote staging area. Each user's changes are sandboxed to their own identity, so when you add to a remote staging workspace, your changes will not overlap with those of other users.
The commands you are used to working with in your local workspace (`status`, `add`, `commit`, etc.) now work with the remote workspace. Each user's changes are sandboxed to their own identity, so when you add to a remote workspace, your changes will not overlap with those of other users.

# Staging Files

@@ -27,7 +27,7 @@ Note: When you do a shallow clone, your local commands will not work until you `

## Create Remote Branch

After you have a shallow clone, then you can create a local branch, and push it to the remote. Every remote branch has a remote staging area that is tied to the branch.
After you have a shallow clone, then you can create a local branch, and push it to the remote. Every remote branch has a remote workspace that is tied to the branch.

```bash
$ oxen checkout -b add-images
@@ -36,15 +36,15 @@ $ oxen push origin add-images

## Check Remote Status

Now that you have created a remote branch, you can interact with the remote staging area with the `oxen remote` subcommand. By default, `oxen remote` operates on the remote server's copy of the branch you are currently on.
Now that you have created a remote branch, you can interact with the remote workspace with the `oxen remote` subcommand. By default, `oxen remote` operates on the remote server's copy of the branch you are currently on.

```bash
$ oxen remote status
```

## Remote Add File

To add a file to the remote staging area simply use `oxen remote add`.
To add a file to the remote workspace simply use `oxen remote add`.

```bash
$ oxen remote add image.jpg
@@ -55,13 +55,13 @@ For relative paths, oxen will mirror the directory structure you have locally.
```bash
$ mkdir my-images/ # create local dir
$ cp /path/to/image.jpg my-images/ # add image to local dir
$ oxen remote add my-images/image.jpg # upload image to remote staging area in the my-images/ directory
$ oxen remote add my-images/image.jpg # upload image to the remote workspace in the my-images/ directory
```

For absolute paths to a file, you will also need to specify the path you would like to put it in with the `-p` flag.

```bash
$ oxen remote add /path/to/image.jpg -p my-images # upload image to remote staging area
$ oxen remote add /path/to/image.jpg -p my-images # upload image to the remote workspace
```
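
The relative-versus-absolute path rule described above can be sketched as a dry-run helper. The `oxen_remote_add_cmd` function is hypothetical and not part of Oxen: it only prints the `oxen remote add` command it would run, using the `-p` flag exactly as shown in the examples above, and it assumes that an absolute path without a destination directory should be rejected.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Oxen): print the `oxen remote add`
# command for a path, adding `-p` only for absolute paths.
oxen_remote_add_cmd() {
  local path="$1" dest="${2:-}"
  case "$path" in
    /*)
      # Absolute path: a destination directory is required.
      if [ -z "$dest" ]; then
        echo "absolute path needs a destination directory" >&2
        return 1
      fi
      echo "oxen remote add $path -p $dest"
      ;;
    *)
      # Relative path: the local directory structure is mirrored remotely.
      echo "oxen remote add $path"
      ;;
  esac
}

oxen_remote_add_cmd my-images/image.jpg
oxen_remote_add_cmd /path/to/image.jpg my-images
```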

You can now use the `oxen remote status` command to see the files that are staged on the remote branch.
@@ -81,9 +81,9 @@ Files to be committed:

## Delete Remotely Added File

If you accidentally add a file to the remote staging area and want to remove it, no worries, you can unstage it with `oxen remote rm`.
If you accidentally add a file to the remote workspace and want to remove it, no worries, you can unstage it with `oxen remote rm`.

(TODO: right now the functionality only operates on the staging area regardless of the --staged flag; we might want to allow remote removal of files and directories.)
(TODO: right now the functionality only operates on the workspace regardless of the --staged flag; we might want to allow remote removal of files and directories.)

```bash
$ oxen remote rm --staged my-images/image.jpg
@@ -158,7 +158,7 @@ shape: (1, 7)
└──────────────────────────────────┴──────────────────────┴───────┴───────┴───────┴───────┴────────┘
```

This returns a unique ID for the row that we can use as a handle to interact with the specific row in the remote staging area. To list the added rows on the dataframe you can use the `oxen remote diff` command.
This returns a unique ID for the row that we can use as a handle to interact with the specific row in the remote workspace. To list the added rows on the dataframe you can use the `oxen remote diff` command.

```bash
$ oxen remote diff annotations/train.csv
24 changes: 12 additions & 12 deletions oxen/docs/build/html/_sources/getting_started/installation.md
@@ -19,27 +19,27 @@ brew install oxen
### Ubuntu Latest

```bash
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.6.1+1/oxen-ubuntu-latest-0.6.1+1.deb
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.7.1/oxen-ubuntu-latest-0.7.1.deb
```

```bash
sudo dpkg -i oxen-ubuntu-latest-0.6.1+1.deb
sudo dpkg -i oxen-ubuntu-latest-0.7.1.deb
```

### Ubuntu 20.04

```bash
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.6.1+1/oxen-ubuntu-20.04-0.6.1+1.deb
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.7.1/oxen-ubuntu-20.04-0.7.1.deb
```

```bash
sudo dpkg -i oxen-ubuntu-20.04-0.6.1+1.deb
sudo dpkg -i oxen-ubuntu-20.04-0.7.1.deb
```

### Windows

```bash
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.6.1+1/oxen.exe
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.7.1/oxen.exe
```
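
The release URLs in this hunk appear to follow one pattern, which the sketch below captures. The `oxen_deb_url` function is a hypothetical helper, and the pattern is inferred only from the `.deb` URLs visible in this diff; other releases or file types (such as the `.exe` and Docker `.tar` downloads) may not follow it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a .deb release URL from a version,
# a platform label, and a binary name (oxen or oxen-server).
# The URL layout is an assumption inferred from the URLs in this diff.
oxen_deb_url() {
  local version="$1" platform="${2:-ubuntu-latest}" binary="${3:-oxen}"
  echo "https://github.com/Oxen-AI/Oxen/releases/download/v${version}/${binary}-${platform}-${version}.deb"
}

oxen_deb_url 0.7.1 ubuntu-20.04
oxen_deb_url 0.7.1 ubuntu-latest oxen-server
```

Keeping the version in one argument means a release bump like the one in this commit touches a single value instead of every download line.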

## Server Install
@@ -59,11 +59,11 @@ brew install oxen-server
### Docker

```bash
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.6.1+1/oxen-server-docker-0.6.1+1.tar
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.7.1/oxen-server-docker-0.7.1.tar
```

```bash
docker load < oxen-server-docker-0.6.1+1.tar
docker load < oxen-server-docker-0.7.1.tar
```

```bash
@@ -73,27 +73,27 @@ docker run -d -v /var/oxen/data:/var/oxen/data -p 80:3001 oxen/oxen-server:lates
### Ubuntu Latest

```bash
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.6.1+1/oxen-server-ubuntu-latest-0.6.1+1.deb
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.7.1/oxen-server-ubuntu-latest-0.7.1.deb
```

```bash
sudo dpkg -i oxen-server-ubuntu-latest-0.6.1+1.deb
sudo dpkg -i oxen-server-ubuntu-latest-0.7.1.deb
```

### Ubuntu 20.04

```bash
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.6.1+1/oxen-server-ubuntu-20.04-0.6.1+1.deb
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.7.1/oxen-server-ubuntu-20.04-0.7.1.deb
```

```bash
sudo dpkg -i oxen-server-ubuntu-20.04-0.6.1+1.deb
sudo dpkg -i oxen-server-ubuntu-20.04-0.7.1.deb
```

### Windows

```bash
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.6.1+1/oxen-server.exe
wget https://github.com/Oxen-AI/Oxen/releases/download/v0.7.1/oxen-server.exe
```

To get up and running using the client and server, you can follow the [getting started docs](https://github.com/Oxen-AI/oxen-release).
5 changes: 2 additions & 3 deletions oxen/docs/build/html/_sources/index.rst
@@ -24,8 +24,7 @@
:hidden:

concepts/data_frames.md
concepts/remote_staging.md
concepts/embedding_search.md
concepts/remote_workspace.md

.. toctree::
:maxdepth: 2
@@ -35,7 +34,7 @@

references/python/local_repo.rst
references/python/remote_repo.rst
references/python/data_loaders/index
references/python/data_loaders/index.rst

.. toctree::
:maxdepth: 2
@@ -4,6 +4,7 @@
:maxdepth: 2
:name: data_loaders

overview.md
image_loader.rst
chat_loader.rst
regression_loader.rst