# SDK Reference Files - `files()` - write

Ocean Data Platform offers both API and Python SDK interfaces. This notebook highlights the Python SDK.

## Installation

If you are not working in ODP Workspaces, first install the Python SDK package:

```bash
pip install -U odp-sdk
```

## Client Initialization

In [42]:
from odp.client import Client

In [43]:
client = Client()

Outside ODP Workspaces you can authenticate with an API key, which does not require opening a browser.
You can generate an API key in the Ocean Data Platform web interface, under your user profile.
```python
client = Client(api_key="your-api-key")
```
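To keep the key out of the notebook itself, it can be read from an environment variable. The following is a minimal sketch; the variable name `ODP_API_KEY` and the helper `read_api_key` are assumptions made for illustration and are not defined by the SDK.

```python
import os


def read_api_key(env_var="ODP_API_KEY", environ=None):
    """Return the API key stored in an environment variable.

    `ODP_API_KEY` is a hypothetical variable name chosen for this example.
    Raises RuntimeError if the variable is unset or empty.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before creating the client")
    return key


# Usage (requires odp-sdk to be installed):
# from odp.client import Client
# client = Client(api_key=read_api_key())
```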

## Upload a file to a dataset

To upload a file, you first need an existing dataset. To create one, go to https://app.hubocean.earth/my_data, press "Add New Dataset", and fill in the name, description, and optional tags. The dataset's UUID is shown on the Dataset admin page.

In [5]:
dataset = client.dataset("my-dataset-uuid")

### upload()

In [45]:
# Upload a CSV file from disk to the dataset
with open("Zooplankton_Chlorophyll_Distribution.csv", "rb") as f:
    file_id = dataset.files.upload("Zooplankton_Chlorophyll_Distribution.csv", f)
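When uploading several files, the pattern above can be wrapped in a small helper that derives the uploaded name from the path. This is a sketch only: `upload_path` is a hypothetical helper, and it assumes `dataset.files.upload(name, fileobj)` returns the new file id, as in the cell above.

```python
from pathlib import Path


def upload_path(dataset, path):
    """Upload the file at `path` under its own file name and return the file id.

    `dataset` is expected to be the object returned by client.dataset(...).
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(path)
    # Open in binary mode, as in the upload() example above
    with path.open("rb") as f:
        return dataset.files.upload(path.name, f)


# Usage:
# file_id = upload_path(dataset, "Zooplankton_Chlorophyll_Distribution.csv")
```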

### update_meta()

Files have metadata that you can create and update through the SDK. The standard schema consists of `id`, `name`, `created`, `updated`, `size`, `format`, `mimetype`, and `geometry`. In addition, you can store your own custom metadata in `properties`.

In [46]:
# Update file metadata
custom_meta = {
    "value" : 2000,
    "description" : "Measurements from boat"
}

dataset.files.update_meta(
    file_id,
    {
        "name" : "Zooplankton and Chlorophyll Distribution in Southampton Water",
        "properties" : custom_meta
    }
)

{'id': '192e1bc5-1e1',
 'name': 'Zooplankton and Chlorophyll Distribution in Southampton Water',
 'created': '2025-10-31T14:25:20.517000+00:00',
 'updated': '2025-10-31T14:25:20.517000+00:00',
 'size': 31855,
 'format': 'csv',
 'mimetype': 'text/csv',
 'geometry': None,
 'properties': {'value': '2000', 'description': 'Measurements from boat'}}
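In the output above, the integer `2000` passed in `properties` comes back as the string `'2000'`. Assuming custom property values are stored as strings (an inference from this single output, not a documented guarantee), converting them before the call makes the round trip predictable. `stringify_properties` is a hypothetical helper written for this example.

```python
def stringify_properties(props):
    """Return a copy of `props` with every key and value converted to str."""
    return {str(key): str(value) for key, value in props.items()}


custom_meta = stringify_properties(
    {"value": 2000, "description": "Measurements from boat"}
)
# custom_meta can then be passed as "properties" to update_meta()
```

When reading the metadata back, numeric properties would then need an explicit conversion such as `int(meta["properties"]["value"])`.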

## ingest()

If you prefer to work with your tabular data as a table, use `ingest()`. The supported formats are currently CSV and Parquet/GeoParquet. Ingesting creates a table with the same schema as the file. If the table already exists and its schema matches (same column names and types), the data from the file is appended by default.

In [47]:
dataset.files.ingest(file_id) 
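Since only some formats can be ingested, one option is to check the file's `format` metadata field before calling `ingest()`. The sketch below is built on assumptions: `ingest_if_supported` is a hypothetical helper, and the value `"parquet"` is a guess at how Parquet files are reported (only `"csv"` appears in the output shown earlier).

```python
# Assumed values of the `format` metadata field; only "csv" is confirmed above
SUPPORTED_FORMATS = {"csv", "parquet"}


def ingest_if_supported(dataset, file_meta):
    """Ingest the file described by `file_meta` if its format looks supported.

    `file_meta` is a metadata dict with at least "id" and "format" keys.
    Returns True if ingest() was called, False otherwise.
    """
    fmt = (file_meta.get("format") or "").lower()
    if fmt not in SUPPORTED_FORMATS:
        return False
    dataset.files.ingest(file_meta["id"])
    return True
```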

In [48]:
df = dataset.table.select().all().dataframe()
df.head()

| Unnamed: 0 | Station | date | time (gmt) | lat | long | water depth (m) | depth (m) | temp (C) | sal | fluor | trans | dens | PAR | BATH |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 30.10.2014 | 949 | 50.54.248 | 001.23.296 | 64 | 25 | 118518 | 279455 | 607 | 7672 | 10211443 | 3501 | 1927 |
| 1 | 1 | 30.10.2014 | 949 | 50.54.248 | 001.23.296 | 64 | 5 | 118654 | 278897 | 644 | 16673 | 10210998 | 11584 | 1642 |
| 2 | 1 | 30.10.2014 | 949 | 50.54.248 | 001.23.296 | 64 | 75 | 119709 | 289148 | 705 | 18466 | 10218765 | 71801 | 1597 |
| 3 | 1 | 30.10.2014 | 949 | 50.54.248 | 001.23.296 | 64 | 1 | 121656 | 310134 | 741 | 16191 | 10234688 | 55204 | 1648 |
| 4 | 1 | 30.10.2014 | 949 | 50.54.248 | 001.23.296 | 64 | 125 | 121207 | 304839 | 717 | 15931 | 10230678 | 41308 | 1656 |


## delete()

To delete a file, call `delete()` and pass the `id` of the file.

In [49]:
# Get the metadata for the files and pick the id of the first file.
# (In this example you could also reuse the existing file_id to delete that file.)
# See sdk_reference_files_read.ipynb to learn how to read file metadata.
files = dataset.files.list()
file_id = str(files[0]['id'])

In [50]:
# Delete that file
dataset.files.delete(id=file_id)
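Picking `files[0]` deletes whichever file happens to be listed first. A more targeted variant selects files by their `name` metadata field. The following is a minimal sketch: `delete_by_name` is a hypothetical helper, and it assumes `dataset.files.list()` yields metadata dicts with `id` and `name` keys, as in the metadata output shown earlier.

```python
def delete_by_name(dataset, name):
    """Delete every file whose metadata name equals `name`.

    Returns the list of deleted file ids (empty if nothing matched).
    """
    deleted = []
    for meta in dataset.files.list():
        if meta["name"] == name:
            file_id = str(meta["id"])
            dataset.files.delete(id=file_id)
            deleted.append(file_id)
    return deleted


# Usage:
# delete_by_name(dataset, "Zooplankton and Chlorophyll Distribution in Southampton Water")
```

Returning the deleted ids makes it easy to confirm that the expected number of files was removed.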