## Preface

One of the pivotal projects the EASIER Data Initiative has produced is [ipfs-stac](https://pypi.org/project/ipfs-stac/). The Python library demonstrates that onboarding and interfacing with geospatial data on IPFS is feasible. It lets developers and researchers use STAC APIs enriched with Filecoin and IPFS metadata to fetch, pin, and explore data in a familiar manner. The ecosystem changes constantly: infrastructure updates, breaking changes, and new features keep emerging. The team has made it a responsibility to keep up with these changes, which has required our projects to remain flexible. This notebook explores the new features and changes in ipfs-stac version 0.2.


### Changes Summary

1. When fetching content, the file size is now human-readable (progress is tracked in megabytes)
2. New search functionality via the `searchSTAC` method added to the `Web3` client class, which returns a collection of items
   1. A user can now pass in many of the [query parameter options](https://github.com/radiantearth/stac-api-spec/tree/release/v1.0.0/item-search#query-parameters-and-fields) to search a STAC catalog
3. Added parameters for content uploads to IPFS:
   1. By default, [CIDv1](https://docs.ipfs.tech/concepts/glossary/#cid-v1) identifiers are created
   2. Added an option to select whether to pin content to your IPFS node
   3. Added an option to add a Mutable File System (MFS) reference to the content on upload
   4. Added an option to provide a filename for uploaded content
      1. If a user uploads a file, the filename is extracted from it. You can override it by passing a value to this parameter.
4. Optimized the functions that start and stop the IPFS daemon
5. Assets are no longer fetched by default
6. Added the `getAssetNames` function to retrieve the asset names from a collection or item
7. New `Web3` class property that automatically collects all collection IDs from the STAC endpoint on instantiation
8. `pinned_list` returns only `recursive` pins by default; other pin types can be requested through the `pin_type` argument
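
The filename behavior in item 3.4 can be pictured with a small sketch. The helper name `resolve_upload_name` is hypothetical and not part of ipfs-stac; it only illustrates the "extract from the path unless overridden" rule described above.

```python
from pathlib import Path
from typing import Optional


def resolve_upload_name(file_path: str, file_name: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit name, else use the path's basename."""
    if file_name:
        return file_name
    return Path(file_path).name


# Default: the name comes from the file path.
print(resolve_upload_name("./image.tiff"))
# Override: the caller-supplied name wins.
print(resolve_upload_name("./image.tiff", file_name="example.tiff"))
```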


### Environment Setup
1 - [Install IPFS Kubo CLI](https://docs.ipfs.io/install/ipfs-kubo/) (if you haven't already). This will allow you to run an IPFS node on your local machine.

2 - [Set up a Jupyter Notebook environment](https://www.youtube.com/watch?v=DA6ZAHBPF1U). A convenient method for achieving this is by utilizing the Jupyter integration in Visual Studio Code.
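
Before running the cells below, it can help to confirm that the Kubo CLI is reachable from the notebook's environment. This is a minimal sketch using only the standard library; the helper name `kubo_cli_path` is an assumption for illustration and is not part of ipfs-stac.

```python
import shutil
from typing import Optional


def kubo_cli_path() -> Optional[str]:
    """Return the location of the `ipfs` executable, or None if it is not on PATH."""
    return shutil.which("ipfs")


path = kubo_cli_path()
if path is None:
    print("ipfs not found on PATH; install Kubo before starting the daemon")
else:
    print(f"ipfs found at {path}")
```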

In [60]:
from ipfs_stac.client import Web3, Asset

### Initialize client
easier = Web3(local_gateway="127.0.0.1", stac_endpoint="https://stac.easierdata.info")

### Attributes added to the Web3 class

Two attributes have been added to the `Web3` class. They expose its current configuration and support high-level exploration of the STAC endpoint:

1. Added `client` attribute - Instance of a pystac-client catalog client, which supports all queries available in that library
2. Added `collections` attribute - List of collection IDs found on the configured STAC API

In [4]:
easier.client

In [9]:
print(f"Collections: {easier.collections} \n")

# alternatively, you can retrieve a list of collection objects through the new get_collections() method
easier.get_collections()

Collections: ['landsat-c2l1', 'GEDI_L4A_AGB_Density_V2_1_2056.v2.1'] 



[<CollectionClient id=landsat-c2l1>,
 <CollectionClient id=GEDI_L4A_AGB_Density_V2_1_2056.v2.1>]
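
The relationship between the two outputs above can be sketched as follows: a list of IDs is what you get by reading the `id` of each collection object. This assumes the `collections` attribute is derived this way, which the source does not confirm. The helper names are hypothetical, and the second function needs pystac-client and network access, so it is defined but not called here.

```python
from types import SimpleNamespace
from typing import Iterable, List


def collection_ids(collections: Iterable) -> List[str]:
    """Return the `id` of every collection object, preserving order."""
    return [collection.id for collection in collections]


def ids_from_endpoint(stac_endpoint: str) -> List[str]:
    """Query a live STAC API (requires pystac-client and network access)."""
    # Imported lazily so collection_ids() works without pystac-client installed.
    from pystac_client import Client

    return collection_ids(Client.open(stac_endpoint).get_collections())


# Offline demonstration with stand-in objects instead of real collections.
stand_ins = [SimpleNamespace(id="landsat-c2l1"), SimpleNamespace(id="example-collection")]
print(collection_ids(stand_ins))
```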

### Enhancements to search

The team has introduced methods and attributes on the `Web3` class that support searching and exploring a STAC catalog that may not be entirely managed by the user. These additions make it easier to query and index unfamiliar assets:

1. Added `searchSTAC` method to the `Web3` class - Searches the STAC catalog through the pystac-client `client` attribute, so it accepts the same search parameters.
2. Added `getAssetNames` method to the `Web3` class - Returns a list of asset names for a given collection or item

In [31]:
## Usage of searchSTAC method
coordinates = [12.490827, 41.889249, 12.494162, 41.891876]
items = easier.searchSTACByBox(coordinates,"landsat-c2l1")

print(f"ID from bbox method: {items[0].id}")

# Or alternatively:
items = easier.searchSTAC(bbox=coordinates)
print(f"ID from STAC method: {items[0].id}")

ID from bbox method: LC09_L1TP_191031_20220202_20220202_02_T1
ID from STAC method: LC09_L1TP_191031_20220202_20220202_02_T1
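
Since `searchSTAC` is described as accepting the same parameters as pystac-client's search, a narrower query can be assembled as a dictionary of keyword arguments. The sketch below is offline and hedged: the helper names are hypothetical, the date range is an arbitrary illustration, and whether `searchSTAC` forwards every one of these keys unchanged is an assumption. The bounding box order (west, south, east, north) follows the STAC API specification.

```python
from typing import Dict, List, Optional, Sequence


def validate_bbox(bbox: Sequence[float]) -> List[float]:
    """Check a 2D bounding box given as [west, south, east, north] in degrees."""
    if len(bbox) != 4:
        raise ValueError("expected four numbers: west, south, east, north")
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitude out of range")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitude out of range or south greater than north")
    return [west, south, east, north]


def build_search_params(
    bbox: Sequence[float],
    collections: Optional[Sequence[str]] = None,
    datetime: Optional[str] = None,
    max_items: Optional[int] = None,
) -> Dict[str, object]:
    """Assemble keyword arguments using pystac-client's search parameter names."""
    params: Dict[str, object] = {"bbox": validate_bbox(bbox)}
    if collections:
        params["collections"] = list(collections)
    if datetime:
        params["datetime"] = datetime
    if max_items is not None:
        params["max_items"] = max_items
    return params


params = build_search_params(
    [12.490827, 41.889249, 12.494162, 41.891876],
    collections=["landsat-c2l1"],
    datetime="2022-01-01/2022-12-31",  # illustrative range, not from the notebook
    max_items=5,
)
print(params)
# With a configured client this could then be passed on as: easier.searchSTAC(**params)
```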


In [45]:
## Usage of getAssetNames method
asset_names = easier.getAssetNames(items[0])
asset_names

['SAA',
 'SZA',
 'VAA',
 'VZA',
 'pan',
 'red',
 'blue',
 'green',
 'index',
 'nir08',
 'cirrus',
 'lwir11',
 'lwir12',
 'swir16',
 'swir22',
 'ANG.txt',
 'MTL.txt',
 'MTL.xml',
 'coastal',
 'MTL.json',
 'qa_pixel',
 'qa_radsat',
 'thumbnail',
 'reduced_resolution_browse']
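
For context, a STAC item stores its assets in an `assets` mapping keyed by asset name, so the names listed above correspond to that mapping's keys. The sketch below works on a plain item dictionary and is not the library's implementation. The function name is hypothetical and the `href` values are placeholders.

```python
from typing import Dict, List


def asset_names(item: Dict) -> List[str]:
    """Return the keys of a STAC item's `assets` mapping."""
    return list(item.get("assets", {}))


# Trimmed, hypothetical item with placeholder hrefs.
example_item = {
    "type": "Feature",
    "id": "example-item",
    "assets": {
        "SAA": {"href": "https://example.com/SAA.tif"},
        "red": {"href": "https://example.com/red.tif"},
        "MTL.json": {"href": "https://example.com/MTL.json"},
    },
}
print(asset_names(example_item))
```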

### Refactored data fetching

Three changes have been made to ipfs-stac that affect the results of the `pinned_list` method and the point at which the data for an `Asset` instance is fetched:

1. The `pinned_list` method now retrieves only `recursive` pins by default; you can request `direct` or `indirect` pins through the `pin_type` argument.
2. The `pinned_list` method now has a `names` argument (boolean) that controls whether results are filtered to pinned content that has a name.
3. The data associated with an `Asset` object is no longer fetched by default. To retrieve the data, call the `fetch` method and then access it through the `data` attribute.

In [51]:
## Usage of updated pinned_list method
recursive_pins = easier.pinned_list()

indirect_pins = easier.pinned_list(pin_type="indirect", names=False)

print(f"Recursive pins: {len(recursive_pins)}")
print(f"Indirect pins: {len(indirect_pins)}")

Recursive pins: 17
Indirect pins: 635
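
The effect of `pin_type` and `names` can be illustrated with a simplified, offline filter. It assumes pin records shaped like the `Keys` mapping returned by Kubo's `pin/ls` RPC, where each CID maps to a `Type` and, optionally, a `Name`. The function name and the sample records are hypothetical, and the library may implement this differently.

```python
from typing import Dict, List


def filter_pins(keys: Dict[str, dict], pin_type: str = "recursive", names: bool = False) -> List[str]:
    """Select CIDs of one pin type, optionally keeping only pins that have a name."""
    selected = []
    for cid, info in keys.items():
        if info.get("Type") != pin_type:
            continue
        if names and not info.get("Name"):
            continue
        selected.append(cid)
    return selected


# Placeholder records, not real CIDs.
sample_keys = {
    "cid-a": {"Type": "recursive", "Name": "landsat-scene"},
    "cid-b": {"Type": "recursive"},
    "cid-c": {"Type": "indirect"},
}
print(filter_pins(sample_keys))
print(filter_pins(sample_keys, names=True))
print(filter_pins(sample_keys, pin_type="indirect"))
```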


In [58]:
## Fetching data for an asset
demo_asset = easier.getAssetFromItem(items[0], asset_name="SAA")
print(f"Before: {demo_asset.data}")

demo_asset.fetch()

print(f"After: {len(demo_asset.data)}")

Before: None
✅  Fetching QmViLsVJURYFhz2isxHJUYVd3MFc26Czstma5pow3SnK5d - 1.50/1.50 MB
After: 1574866
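
The progress line reports 1.50 MB for 1574866 bytes, which is consistent with dividing by 1024² (1574866 / 1048576 ≈ 1.502). The sketch below reproduces that formatting under this assumption. The function name is hypothetical and the library's actual code may differ.

```python
def format_progress(done_bytes: int, total_bytes: int) -> str:
    """Format byte counts as megabytes, assuming 1 MB = 1024 * 1024 bytes."""
    bytes_per_mb = 1024 ** 2
    return f"{done_bytes / bytes_per_mb:.2f}/{total_bytes / bytes_per_mb:.2f} MB"


print(format_progress(1574866, 1574866))  # 1.50/1.50 MB
```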


In [None]:
# Alternatively, you can force data to be fetched through the fetch_data argument of getAssetFromItem
demo_asset = easier.getAssetFromItem(items[0], asset_name="SAA", fetch_data=True)
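
The fetch-on-demand behavior can be pictured as a lazy-loading pattern: the object holds a reference to the content and only downloads it when asked. The class below is a stand-alone sketch with a hypothetical name and an injected fetcher function; it is not the library's `Asset` class, and whether the real `fetch` caches its result is not stated in the source.

```python
from typing import Callable, List, Optional


class LazyAsset:
    """Hypothetical stand-in that defers downloading until fetch() is called."""

    def __init__(self, cid: str, fetcher: Callable[[str], bytes], fetch_data: bool = False):
        self.cid = cid
        self._fetcher = fetcher
        self.data: Optional[bytes] = None
        if fetch_data:
            self.fetch()

    def fetch(self) -> bytes:
        # Download once; later calls reuse the stored bytes (a choice made for this sketch).
        if self.data is None:
            self.data = self._fetcher(self.cid)
        return self.data


calls: List[str] = []


def fake_fetcher(cid: str) -> bytes:
    """Record the request and return placeholder bytes instead of hitting IPFS."""
    calls.append(cid)
    return b"raster-bytes"


asset = LazyAsset("example-cid", fake_fetcher)
print(asset.data)       # None
asset.fetch()
print(len(asset.data))  # 12
```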

### Added ability to write/upload to IPFS Mutable File System

The IPFS Mutable File System (MFS) is a powerful feature for organizing data stored on the network. It presents content under familiar, editable file paths.

1. The `uploadToIPFS` method has been updated to support writing to an MFS path
2. The `Asset` class now has an `addToMFS` method which supports writing to a specific directory with the option of specifying a file name.

In [None]:
## Example usage
easier.uploadToIPFS(file_path="./image.tiff", file_name="example.tiff", pin_content=True, mfs_path="images")

demo_asset.addToMFS(filename="blog_post")
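
To make the MFS step more concrete, the sketch below shows how a directory and filename combine into an absolute MFS path, and how a copy into MFS could be expressed as the query string for Kubo's `files/cp` RPC (source argument first, then destination). The helper names are hypothetical, and this does not claim to be how ipfs-stac performs the write.

```python
import posixpath
from urllib.parse import urlencode


def mfs_destination(mfs_path: str, filename: str) -> str:
    """Join an MFS directory and a filename into an absolute MFS path."""
    return posixpath.join("/", mfs_path.strip("/"), filename)


def files_cp_query(cid: str, destination: str) -> str:
    """Build the query string for Kubo's `files/cp` RPC: source, then destination."""
    return urlencode([("arg", f"/ipfs/{cid}"), ("arg", destination)])


dest = mfs_destination("images", "example.tiff")
print(dest)
print(files_cp_query("QmViLsVJURYFhz2isxHJUYVd3MFc26Czstma5pow3SnK5d", dest))
```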

## Closing remarks

All in all, the team has produced new features that streamline working with STAC catalogs enriched with IPFS metadata. These changes are a big step toward showing what decentralized infrastructure can do when combined with geospatial data. Stay tuned for more posts that highlight these changes in action. For more technical details, keep an eye on the [GitHub repository](https://github.com/easierdata/ipfs-stac).