Merged
1,644 changes: 52 additions & 1,592 deletions README.md

16 changes: 16 additions & 0 deletions docs/code-of-conduct.md
@@ -1,3 +1,19 @@
<!--
Copyright 2025 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Code of Conduct

## Our Pledge
16 changes: 16 additions & 0 deletions docs/contributing.md
@@ -1,3 +1,19 @@
<!--
Copyright 2025 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# How to Contribute

We would love to accept your patches and contributions to this project.
116 changes: 116 additions & 0 deletions docs/installation.md
@@ -0,0 +1,116 @@
<!--
Copyright 2025 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Installation

There are two ways to install XPK:

- via Python package installer (`pip`),
- clone from git and build from source.

## Prerequisites

The following tools must be installed:

- python >= 3.10: download from [here](https://www.python.org/downloads/)
- pip: [installation instructions](https://pip.pypa.io/en/stable/installation/)
- python venv: [installation instructions](https://virtualenv.pypa.io/en/latest/installation.html)
  (all three of the above can be installed at once from [here](https://packaging.python.org/en/latest/guides/installing-using-linux-tools/#installing-pip-setuptools-wheel-with-linux-package-managers))
- gcloud: install from [here](https://cloud.google.com/sdk/gcloud#download_and_install_the) and then:
  - Run `gcloud init`
  - [Authenticate](https://cloud.google.com/sdk/gcloud/reference/auth/application-default/login) to Google Cloud
- kubectl: install from [here](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl#install_kubectl) and then:
  - Install `gke-gcloud-auth-plugin` from [here](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl#install_plugin)
- docker: [installation instructions](https://docs.docker.com/engine/install/) and then:
  - Configure sudoless docker: [guide](https://docs.docker.com/engine/install/linux-postinstall/)
  - Run `gcloud auth configure-docker` to ensure images can be uploaded to the registry
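
Before moving on, it can help to confirm that the tools listed above are actually on your `PATH`. The following is a minimal sketch, not part of XPK: the function names `check_tools` and `check_python_version` are illustrative, and the tool names passed in (for example `pip` rather than `pip3`) may need adjusting for your system.

```shell
# Report, for each tool name given, whether it can be found on PATH.
# Returns a non-zero status if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Succeeds only if python3 exists and is version 3.10 or newer.
check_python_version() {
  python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 10) else 1)'
}

check_tools python3 pip gcloud kubectl docker || echo "Install the missing tools before continuing."
check_python_version || echo "python3 is missing or older than 3.10."
```

The check only looks for executables; it does not verify that `gcloud` is authenticated or that docker runs without `sudo`.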

### Additional prerequisites when installing from pip

- kueuectl: install from [here](https://kueue.sigs.k8s.io/docs/reference/kubectl-kueue/installation/)
- kjob: installation instructions [here](https://github.com/kubernetes-sigs/kjob/blob/main/docs/installation.md)

### Additional prerequisites when installing from source

- git: [installation instructions](https://git-scm.com/downloads/linux)
- make: install by running `apt-get -y install make` (`sudo` might be required)

### Additional prerequisites to enable bash completion

- Install [argcomplete](https://pypi.org/project/argcomplete/) globally on your machine.
```shell
pip install argcomplete
activate-global-python-argcomplete
```
- Configure `argcomplete` for XPK.
```shell
eval "$(register-python-argcomplete xpk)"
```

## Installation via pip

To install XPK using pip, first install the required tools listed in [prerequisites](#prerequisites) and [additional prerequisites](#additional-prerequisites-when-installing-from-pip). Then install XPK by running:

```shell
pip install xpk
```

If you see an error saying: `This environment is externally managed`, please use a virtual environment. For example:

```shell
# One time step of creating the virtual environment
VENV_DIR=~/venvp3
python3 -m venv $VENV_DIR

# Activate your virtual environment
source $VENV_DIR/bin/activate

# Install XPK in virtual environment using pip
pip install xpk
```

## Installation from source

To install XPK from source, first install the required tools listed in [prerequisites](#prerequisites) and [additional prerequisites](#additional-prerequisites-when-installing-from-source). Afterwards, install XPK from source using `make`:

```shell
# Clone the XPK repository
git clone https://github.com/google/xpk.git
cd xpk

# Install required dependencies and build XPK with make
make install && export PATH=$PATH:$PWD/bin
```

If you want the dependencies to remain available in your `PATH` in new shell sessions, run `echo $PWD/bin` and add the printed value to `PATH` in your `.bashrc` or `.zshrc` file.
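
One way to make the `PATH` change persistent is sketched below. The helper name `persist_path_entry` is hypothetical and not part of XPK; it appends an `export` line to the startup file you name and skips the write if the same line is already present.

```shell
# Append an `export PATH=...` line for the given directory to a shell
# startup file, unless that exact line is already there.
# $1: directory to add, $2: startup file (for example ~/.bashrc).
persist_path_entry() {
  dir=$1
  rc_file=$2
  line="export PATH=\"\$PATH:$dir\""
  touch "$rc_file"
  if ! grep -qxF "$line" "$rc_file"; then
    printf '%s\n' "$line" >> "$rc_file"
  fi
}

# Example, run from the root of the cloned xpk repository:
#   persist_path_entry "$PWD/bin" ~/.bashrc
```

Open a new terminal, or `source` the startup file, for the change to take effect.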

If you see an error saying: `This environment is externally managed`, please use a virtual environment. For example:

```shell
# One time step of creating the virtual environment
VENV_DIR=~/venvp3
python3 -m venv $VENV_DIR

# Activate your virtual environment
source $VENV_DIR/bin/activate

# Clone the XPK repository
git clone https://github.com/google/xpk.git
cd xpk

# Install required dependencies and build XPK with make
make install && export PATH=$PATH:$PWD/bin
```
61 changes: 61 additions & 0 deletions docs/local_testing.md
@@ -0,0 +1,61 @@
<!--
Copyright 2025 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Local testing with Kind

To facilitate development and testing locally, we have integrated support for testing with `kind`. This enables you to simulate a Kubernetes environment on your local machine.

## Prerequisites

- Install kind on your local machine. Follow the official documentation here: [Kind Installation Guide](https://kind.sigs.k8s.io/docs/user/quick-start#installation).

## Usage

xpk uses kind to create and manage Kubernetes clusters on your local machine, so workloads can be run against them without a cloud cluster. The commands for managing local clusters are:

### Cluster Create
* Cluster create:

```shell
xpk kind create \
--cluster xpk-test
```

### Cluster Delete
* Cluster delete:

```shell
xpk kind delete \
--cluster xpk-test
```

### Cluster List
* Cluster list:

```shell
xpk kind list
```
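
The three commands above can be chained into a quick smoke test of a local setup. This is a sketch under assumptions: the function name `kind_smoke_test` and the `XPK` override variable are illustrative and not xpk features. Setting `XPK=echo` prints the commands instead of running them.

```shell
# Create a kind cluster, list clusters, then delete the cluster again.
# $1: cluster name (defaults to xpk-test, as in the examples above).
# The XPK variable is a hypothetical override for the xpk executable.
kind_smoke_test() {
  cluster=${1:-xpk-test}
  xpk_cmd=${XPK:-xpk}
  "$xpk_cmd" kind create --cluster "$cluster" || return 1
  # Remember a failed listing, but still clean up the cluster.
  status=0
  "$xpk_cmd" kind list || status=$?
  "$xpk_cmd" kind delete --cluster "$cluster" || return 1
  return "$status"
}

# Dry run that only prints the commands:
#   XPK=echo kind_smoke_test my-cluster
```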

## Local Testing Basics

Local testing is available exclusively through the `batch` and `job` commands of xpk with the `--kind-cluster` flag. This allows you to simulate training jobs locally:

```shell
python xpk.py batch [other-options] --kind-cluster script
```

Please note that all other xpk subcommands are intended for use with cloud systems on Google Compute Engine (GCE) and don't support local testing. This includes commands such as `cluster`, `info`, and `inspector`.

27 changes: 27 additions & 0 deletions docs/permissions.md
@@ -0,0 +1,27 @@
<!--
Copyright 2025 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Permissions needed on Cloud Console:

* Artifact Registry Writer
* Compute Admin
* Kubernetes Engine Admin
* Logging Admin
* Monitoring Admin
* Service Account User
* Storage Admin
* Vertex AI Administrator
* Filestore Editor (This role is necessary if you want to run the `storage create` command with `--type=gcpfilestore`)
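
If you prefer the command line to the Cloud Console, the roles above can be granted with `gcloud projects add-iam-policy-binding`. The sketch below is an assumption-laden example, not part of XPK: the role IDs are our mapping of the display names above to predefined IAM roles and should be verified against the IAM documentation, and the names `XPK_ROLES`, `grant_xpk_roles`, and the `GCLOUD` override are illustrative. The optional Filestore Editor role (assumed to be `roles/file.editor`) is left out.

```shell
# Assumed IAM role IDs for the display names listed above; verify them
# before use. Filestore Editor (roles/file.editor) is not included.
XPK_ROLES="
roles/artifactregistry.writer
roles/compute.admin
roles/container.admin
roles/logging.admin
roles/monitoring.admin
roles/iam.serviceAccountUser
roles/storage.admin
roles/aiplatform.admin
"

# Grant every role in XPK_ROLES to one member of one project.
# $1: project ID, $2: member (for example user:someone@example.com).
# The GCLOUD variable is a hypothetical override for the gcloud executable.
grant_xpk_roles() {
  project=$1
  member=$2
  gcloud_cmd=${GCLOUD:-gcloud}
  for role in $XPK_ROLES; do
    "$gcloud_cmd" projects add-iam-policy-binding "$project" \
      --member="$member" --role="$role" || return 1
  done
}

# Dry run that only prints the commands:
#   GCLOUD=echo grant_xpk_roles my-project user:someone@example.com
```

Granting roles requires sufficient IAM permissions on the project, so this may need to be run by a project owner or administrator.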