# SDK Reference Files - `files()` - write

Ocean Data Platform offers both API and Python SDK interfaces. This notebook highlights the Python SDK.

## Installation

If you are not working in the [ODP Workspace](https://workspace.hubocean.earth/), you first need to install the Python SDK package.

```bash
pip install -U odp-sdk
```

## Client Initialization

In [None]:
# Import the Client class from the odp.client module
from odp.client import Client

# Create an instance of the Client class
client = Client()

If you are using the [ODP Workspace](https://workspace.hubocean.earth/), you are authenticated automatically. If you are working outside the Workspace, initializing the `Client` opens a browser window to perform the authentication process.
If you are working outside the ODP Workspace and don't want to authenticate through the browser, you can use API key authentication instead.
You can generate an API key in the Ocean Data Platform web interface, under your user profile.
```python
client = Client(api_key="your-api-key")
```
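
To keep the key out of the notebook itself, one option is to read it from an environment variable. The sketch below assumes a variable named `ODP_API_KEY`; that name and the helper `load_api_key` are illustrative choices for this example and are not part of the SDK.

```python
import os


def load_api_key(var_name="ODP_API_KEY"):
    """Return the API key stored in the given environment variable.

    ODP_API_KEY is a hypothetical variable name chosen for this example.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Usage (requires odp-sdk and a real key):
# from odp.client import Client
# client = Client(api_key=load_api_key())
```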

## Dataset Access

With an initialized `Client`, you can access a dataset by its UUID. The easiest way to find the UUID is to search for the dataset in the [ODP Catalog](https://app.hubocean.earth/catalog) (click API).
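
A copied identifier is easy to truncate or mistype, so it can be worth checking it locally before calling the SDK. This is a minimal sketch, assuming dataset identifiers are standard UUID strings as described above; `is_uuid` is a helper written for this example, not an SDK function.

```python
import uuid


def is_uuid(value):
    """Return True if `value` parses as a UUID string."""
    try:
        uuid.UUID(str(value))
    except ValueError:
        return False
    return True
```

A placeholder such as `"my-dataset-id"` fails this check, which is a quick reminder to replace it with a real UUID.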

## Upload a file to a dataset

To upload a file, you first need a dataset. If you don't have one, go to https://app.hubocean.earth/my_data, press "Add New Dataset", and fill in the name, description, and (optionally) tags. The dataset's admin page shows its UUID.

In [None]:
# Get dataset
dataset = client.dataset("my-dataset-id")  # Replace this with your own dataset UUID

### upload()

Here we use a file from the [GitHub repo](https://github.com/C4IROcean/OceanDataPlatform) called `Zooplankton_Chlorophyll_Distribution.csv`, and we save it under the same name.

In [45]:
# Upload a CSV file to the dataset
with open("Zooplankton_Chlorophyll_Distribution.csv", "rb") as f:
    file_id = dataset.files.upload("Zooplankton_Chlorophyll_Distribution.csv", f)
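
The same call can be repeated for several files. The sketch below is a hypothetical helper, `upload_folder`, that is not part of the SDK; it only relies on `dataset.files.upload(name, file_object)` as used in the cell above and assumes the call returns the file id.

```python
from pathlib import Path


def upload_folder(dataset, folder, pattern="*.csv"):
    """Upload every file in `folder` matching `pattern`.

    Uses dataset.files.upload(name, file_object) for each file and
    returns a dict mapping each file name to the returned file id.
    """
    file_ids = {}
    for path in sorted(Path(folder).glob(pattern)):
        with open(path, "rb") as f:
            file_ids[path.name] = dataset.files.upload(path.name, f)
    return file_ids


# Usage (requires an initialized dataset):
# file_ids = upload_folder(dataset, "data")
```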

### update_meta()

Files have metadata that you can create and update through the SDK. The standard schema consists of `id`, `name`, `created`, `updated`, `size`, `format`, `mimetype`, and `geometry`. In addition, you can store your own custom metadata in `properties`.


In [None]:
# Define custom metadata to be added to the file
custom_meta = {
    "value" : 2000,
    "description" : "Measurements from boat"
}

# Update the file metadata with a new name and custom properties
dataset.files.update_meta(
    file_id,
    {
        "name" : "Zooplankton and Chlorophyll Distribution in Southampton Water",
        "properties" : custom_meta
    }
)
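
If you update metadata in several places, a small helper can keep the payload shape consistent. This is a sketch, not an SDK function: `build_meta` is a name chosen for this example, and it only produces the two keys shown in the cell above (`name` and `properties`).

```python
def build_meta(name=None, **custom):
    """Build a payload dict for update_meta().

    `name` becomes the top-level "name" field; any other keyword
    arguments are collected under "properties" as custom metadata.
    """
    payload = {}
    if name is not None:
        payload["name"] = name
    if custom:
        payload["properties"] = dict(custom)
    return payload


# Usage (requires an initialized dataset and a file_id):
# dataset.files.update_meta(file_id, build_meta("New name", value=2000))
```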

### ingest()

If you prefer to work with your tabular data as a table, you can use `ingest()`. The currently supported formats are CSV and Parquet/GeoParquet. Ingesting creates a table with the same schema as the file. If the table already exists and its schema matches (column names and types), the data from the file is appended by default.

In [None]:
# Ingest the uploaded file into the dataset
dataset.files.ingest(file_id)

In [None]:
# Query all data and convert to pandas DataFrame
df = dataset.table.select().all().dataframe()
# Display the first few rows of the DataFrame
df.head()
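
Because appending depends on the schema matching, it can help to compare a new CSV file with one that was already ingested before calling `ingest()` again. The sketch below uses only the standard library and checks column names, not types. Treating column order as significant is a conservative assumption of this example, and both helper names are hypothetical.

```python
import csv


def csv_columns(path):
    """Return the column names from the first row of a CSV file."""
    with open(path, newline="") as f:
        return next(csv.reader(f), [])


def same_columns(path_a, path_b):
    """True if both CSV files have identical column names in the same order."""
    return csv_columns(path_a) == csv_columns(path_b)
```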

### delete()

To delete a file, call `delete()` and pass the `id` of the file.

In [49]:
# Get the metadata for the files and pick the id of the first file.
# In this example you could also reuse your existing file_id if you want to delete that file.
files = dataset.files.list()  # See sdk_reference_files_read.ipynb to learn how to read file metadata
file_id = str(files[0]['id'])

In [50]:
# Delete that file
dataset.files.delete(id=file_id)
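
Picking the first entry is fine for a demo, but in practice you usually want to delete a specific file. The sketch below selects ids by file name. It assumes each entry returned by `dataset.files.list()` behaves like a dict with `id` and `name` keys, as in the standard schema; `find_file_ids` is a hypothetical helper, not an SDK function.

```python
def find_file_ids(files, name):
    """Return the ids (as strings) of all entries whose "name" equals `name`."""
    return [str(entry["id"]) for entry in files if entry.get("name") == name]


# Usage (requires an initialized dataset):
# for fid in find_file_ids(dataset.files.list(), "old_upload.csv"):
#     dataset.files.delete(id=fid)
```

Returning a list rather than a single id makes it explicit that several files may share the same name.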