# Managing images
Apart from searching and discovering data available to you, the catalog enables you to upload new images of your own.

## Creating images
There are two general mechanisms for creating images in the catalog. Upload is the primary mechanism, either by uploading supported image file types such as GeoTIFF or JPEG, or by uploading image data in the form of a NumPy ndarray. The other mechanism is to create “remote” image entries in the catalog without supplying the actual image data.

### Uploading image files
If your data already exists on disk as an image file, usually a GeoTIFF or JPEG file, you can upload it directly.

In the following examples we will upload data with a single band representing the blue light spectrum. First let’s create a product and band corresponding to that:

In [None]:
# Create a product
from descarteslabs.catalog import (
    Band,
    DataType,
    Product,
    Resolution,
    ResolutionUnit,
    SpectralBand,
)
from uuid import uuid4

product = Product(id=f"guide-example-product-{uuid4()}", name="Example product")
product.save()
print(product.id)
# Create a band
band = SpectralBand(name="blue", product=product)
band.data_type = DataType.UINT16
band.data_range = (0, 10000)
band.display_range = (0, 4000)
band.resolution = Resolution(unit=ResolutionUnit.METERS, value=60)
band.band_index = 0
band.save()

Now create a new image and use `image.upload()` to upload imagery to the new product. This returns an `ImageUpload`. Images are uploaded and processed asynchronously, so they are not available in the catalog immediately. Calling `upload.wait_for_completion()` blocks until the upload has completely finished.

In [None]:
from descarteslabs.catalog import Image

# Set any attributes that should be set on the uploaded images
image = Image(product=product, name="image1")
image.acquired = "2012-01-02"
image.cloud_fraction = 0.1
# Do the upload
image_path = "data/blue.tif"
upload = image.upload(image_path)
upload.wait_for_completion()
upload.status

### Uploading ndarrays
Often, when creating a derived product - for example, running a classification model on existing data - you’ll have a NumPy array (often referred to as an “ndarray”) in memory instead of a file written to disk. In that case, you can use `upload_ndarray()`. This method behaves like `upload()`, with one key difference: you must provide georeferencing attributes for the ndarray.

Georeferencing attributes are used to map between geospatial coordinates (such as latitude and longitude) and their corresponding pixel coordinates in the array. The required attributes are:

- An affine geotransform in GDAL format (the `geotrans` attribute)
- A coordinate reference system definition, preferably as an EPSG code (the `cs_code` attribute) or alternatively as a string in PROJ.4 or WKT format (the `projection` attribute)

If the ndarray you’re uploading was rastered through the platform, this information is easy to get. When rastering, you also receive a dictionary of metadata that includes both of these parameters. With `Image.ndarray()`, you have to set `raster_info=True`; with `Raster.ndarray()`, it’s always returned.
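
To make the role of the geotransform more concrete, the following sketch applies a GDAL-style six-element affine geotransform to pixel coordinates. The helper name `pixel_to_geo` and the sample values are illustrative assumptions and not part of the catalog API; the coordinates are in whatever units the coordinate reference system defines.

```python
def pixel_to_geo(geotrans, col, row):
    """Map a pixel (col, row) to georeferenced (x, y) using a GDAL geotransform.

    GDAL orders the six coefficients as
    (x_origin, pixel_width, row_rotation, y_origin, column_rotation, pixel_height).
    """
    x_origin, pixel_width, row_rot, y_origin, col_rot, pixel_height = geotrans
    x = x_origin + col * pixel_width + row * row_rot
    y = y_origin + col * col_rot + row * pixel_height
    return x, y


# Hypothetical north-up raster: 60 m pixels, upper-left corner at (500000, 4600000).
# The pixel height is negative because row numbers increase southward.
example_geotrans = (500000.0, 60.0, 0.0, 4600000.0, 0.0, -60.0)

upper_left = pixel_to_geo(example_geotrans, 0, 0)
ten_right_five_down = pixel_to_geo(example_geotrans, 10, 5)
print(upper_left)            # (500000.0, 4600000.0)
print(ten_right_five_down)   # (500600.0, 4599700.0)
```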

The following example puts these pieces together. This extracts the blue band from a Landsat 8 scene at a lower resolution and uploads it to our product:

In [None]:
from descarteslabs.catalog import OverviewResampler

image = Image.get(
    "usgs:landsat:oli-tirs:c2:l2:v0:LC08_L2SP_197031_20230106_20230110_02_T1"
)
ndarray, raster_meta = image.ndarray("blue", resolution=60, raster_info=True)
image2 = Image(product=product, name="image2")
image2.acquired = image.acquired
upload2 = image2.upload_ndarray(
    ndarray,
    raster_meta=raster_meta,
    # create overviews for 120m and 240m resolution
    overviews=[2, 4],
    overview_resampler=OverviewResampler.AVERAGE,
)
upload2.wait_for_completion()
upload2.status

The rastered ndarray here is a three-dimensional array in the shape (band, x, y) - the first axis corresponds to the band number. `upload_ndarray()` expects an array in that shape and will raise a warning if it thinks the shape of the array is wrong. If the given array is two-dimensional, it will assume you’re uploading a single-band image.

The example also specifies typically useful values for `overviews` and `overview_resampler`. Overviews allow the platform to raster your image faster at non-native resolutions, at the cost of more storage and a longer initial upload processing time to calculate the overviews.

The `overviews` argument specifies a list of up to 16 different resolution magnification factors to calculate overviews for. For example, `overviews=[2, 4]` calculates two overviews at 2x and 4x the native resolution. The `overview_resampler` argument specifies the algorithm to use when calculating overviews; see `upload_ndarray()` for which algorithms can be used.
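
As a quick sanity check on magnification factors, the sketch below computes the resolution each overview would have for a given native resolution. The helper `overview_resolutions` is a hypothetical name used only for illustration; it is not part of the catalog client, and the limit of 16 factors is taken from the description above.

```python
def overview_resolutions(native_resolution, overviews):
    """Return the resolution of each overview for the given magnification factors."""
    if len(overviews) > 16:
        raise ValueError("at most 16 overview factors are allowed")
    # Each factor multiplies the native pixel size
    return [native_resolution * factor for factor in overviews]


# A 60 m native resolution with factors 2 and 4 gives 120 m and 240 m overviews
resolutions = overview_resolutions(60, [2, 4])
print(resolutions)  # [120, 240]
```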

### Updating images
The image created in the previous example is now available in the Catalog. We can look it up and update any of its attributes like any other catalog object:

In [None]:
image2 = Image.get(image2.id)
image2.cloud_fraction = 0.2
image2.save()

To update the underlying file data, you need to upload a new file or ndarray. However, you must use a new, unsaved `Image` instance (with the original product id and image name) and pass the `overwrite=True` parameter. The reason is that the original image, now saved in the catalog, contains many computed values that may differ from those computed from the new upload, and the catalog cannot know whether you intend to reuse the original values or compute new ones. Also be aware that `overwrite=True` can lead to data cache inconsistencies in the platform that may last a while, so use it sparingly and do not expect to see the updated data immediately.

## Tags & extra attributes
The image attributes you can set, filter by, and sort on are documented on the `Image` class. If you have other structured metadata to attach to your images, you can use `extra_properties`:

In [None]:
image2.extra_properties = {
    "processing_time": 120,
    "quality": 0.5,
    "reviewer": "joe@acme.com",
}

image2.save()

`extra_properties` is a dictionary with string keys and values of any type that can be JSON-serialized (booleans, numbers, strings, lists, dictionaries).
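
If you want to catch unsupported values before saving, a local check along the following lines may help. This is a minimal sketch using only the standard `json` module; the function name `check_extra_properties` is an assumption for illustration, and the catalog may apply additional validation of its own.

```python
import json


def check_extra_properties(extra_properties):
    """Raise TypeError if a key is not a string or a value is not JSON-serializable."""
    for key in extra_properties:
        # json.dumps silently converts some non-string keys, so check them explicitly
        if not isinstance(key, str):
            raise TypeError(f"extra_properties key {key!r} is not a string")
    # json.dumps raises TypeError for values it cannot serialize (for example, sets)
    json.dumps(extra_properties)
    return True


ok = check_extra_properties(
    {"processing_time": 120, "quality": 0.5, "reviewer": "joe@acme.com"}
)
```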

Note that you cannot filter or sort images by `extra_properties`. Use `tags` if you have a finite set of discrete custom values you’d like to filter by:

In [None]:
from descarteslabs.catalog import properties as p

image2.tags = ["temporary", "guide"]
image2.save()

# Find all images in the product tagged "temporary"
search = product.images().filter(p.tags == "temporary")
for image in search:
    print(image)