PostHuman.py

Posthuman makes AI-models-as-a-service available using the web3 technologies of Ocean Protocol. Model Providers contribute funds to train useful models, and Model Consumers purchase inference and evaluation on the models they find most useful. With Posthuman v1 models (POC), users can train, infer, and evaluate on any arbitrary text data. With v2, we've published two commercially useful models on Ocean Market (more below).

Posthuman's decentralised architecture achieves many goals that are impossible with centralised AI providers:

  • Decentralised Model Ownership: The model is owned by the community of datatoken holders, allowing anyone to invest in and profit from useful AI models.
  • Permissionless Development: Fine-tuning advanced AI models is permissioned on Web2 APIs like OpenAI's, and the fine-tuned models are owned by OpenAI and can be unilaterally deleted. In contrast, anyone can fine-tune one of the Posthuman models on their own data, and the resulting model will also be community-owned.
  • Censorship-Resistant Access: Access to AI is fast becoming a basic necessity for a productive life, yet such access can easily be censored by centralised providers. With a decentralised alternative, any holder of crypto is guaranteed to be treated equally by the protocol.

Additional benefits include:

  • Verifiable Training and Inference: The end user can know for certain which model served a particular inference request.
  • Zero-Knowledge Fine-tuning: The marketplace retains control of the models, ensuring that everyone who contributed to training is rewarded fairly; all value created by these models remains on-chain and cannot be 'leaked'.
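The "verifiable inference" property can be sketched with a simple hash commitment: publish a digest of the model weights when the asset is created, and let a consumer check that the weights used to serve a request match that digest. The helper names below are purely illustrative and not part of the Posthuman codebase:

```python
import hashlib


def commit_model(weights: bytes) -> str:
    """Digest published alongside the model asset at creation time."""
    return hashlib.sha256(weights).hexdigest()


def verify_model(weights: bytes, published_digest: str) -> bool:
    """Consumer-side check that the served weights match the commitment."""
    return commit_model(weights) == published_digest


# Toy example with fake weight bytes:
weights = b"\x00\x01\x02"
digest = commit_model(weights)
assert verify_model(weights, digest)
assert not verify_model(b"tampered", digest)
```

This only proves which weights were hashed; binding the digest to a specific inference request additionally requires the commitment to live somewhere tamper-evident, such as the on-chain asset metadata.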

In v2 [Oct 2021], we introduced two custom-trained, commercially useful models on Ocean Market (Polygon) Mainnet:

Model 1: AI Assistant as a service - A custom t-gpt2 model trained on conversational data. This model can be used to build and run conversational AI chatbots across fields, AI-based games such as text adventures, and more.

Model 2: Wikipedia QA as a service - A custom t-roberta model pipeline trained on open-domain Wikipedia question answering. This model can answer questions drawn from the entirety of Wikipedia's text, making it useful for research across fields, such as medical, historical, academic & scientific research.
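Open-domain QA pipelines of this kind typically pair a retriever (which finds the passages most relevant to the question) with a reader model (which extracts the answer span). A deliberately naive, pure-Python sketch of the retriever half, using word overlap instead of a learned index; the reader (the RoBERTa component) is omitted:

```python
def retrieve(question, passages):
    """Return the passage sharing the most words with the question.

    A real pipeline would use a learned or inverted index (e.g. DPR or
    BM25); word overlap is just the simplest stand-in for the idea.
    """
    q_words = set(question.lower().split())

    def overlap(passage):
        return len(q_words & set(passage.lower().split()))

    return max(passages, key=overlap)


passages = [
    "Paris is the capital of France.",
    "The mitochondria is the powerhouse of the cell.",
]
best = retrieve("what is the capital of france", passages)
```

The reader model then answers the question using only `best` as context, which is what lets a fixed-size model cover the entire Wikipedia corpus.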

Documentation [WIP]

Model 1 - Posthuman Conversational AI v1

How to Perform Custom Inference

Edit the โ€œtextโ€ variable in the algorithm. Allow the first two lines to remain as is - they indicate the recommended input format from the paper https://github.com/mgalley/DSTC7-End-to-End-Conversation-Modeling/tree/master/evaluation & https://arxiv.org/pdf/1911.00536.pdf for AI models that build upon the Dialog-gpt2 pretrained weights/ DSTC7 dataset.

How to fine-tune Posthuman Conversational AI v1 on custom data

Edit the algo_training algorithm, specifying your dataset and the fine-tuning hyperparameters. For best results, format the training dataset according to the recommended input format from the papers https://github.com/mgalley/DSTC7-End-to-End-Conversation-Modeling/tree/master/evaluation & https://arxiv.org/pdf/1911.00536.pdf.
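Training data in this format is commonly flattened to one dialogue per line, with turns joined by the end-of-text token. A hypothetical preprocessing sketch; the exact file name and layout expected by algo_training are assumptions, so check its source before using this:

```python
EOS = "<|endoftext|>"  # GPT-2 / DialoGPT end-of-text token


def write_training_file(dialogues, path):
    """Flatten each dialogue (a list of turns) to one EOS-joined line."""
    with open(path, "w", encoding="utf-8") as f:
        for turns in dialogues:
            f.write(EOS.join(turns) + EOS + "\n")


dialogues = [
    ["hi", "hello, how can I help?"],
    ["what's the weather?", "sunny all day."],
]
write_training_file(dialogues, "train.txt")
```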

Codebase updates:

Code upgrades to transformers 4.1 - We've updated all our algorithms to be compatible with the latest version of the Hugging Face transformers library, which includes speed and performance upgrades enabling larger models to be run on the template hardware.

Specifically, the workflow is as follows:

  1. Alice publishes a GPT-2 model M1, trained on any dataset X, using PostHuman's compute-to-data provider. [tests/posthuman-legacy/Alice_flow.py]

  2. Bob buys datatokens and runs further training (fine-tuning) on any custom dataset Y, using the algo_training.py algorithm, to create updated model M2. [tests/posthuman-legacy/Charlie_flow.py]

  3. The updated model (M2): i) remains on the marketplace's machine; ii) is published as an asset on Ocean; iii) Bob and Alice are rewarded with datatokens of the newly trained model.

  4. Charlie decides to train the model further, purchasing datatokens from Bob, creating demand. The second updated model (M3) is likewise published as an asset, and a datatoken reward issued to Charlie [tests/posthuman-legacy/Charlie_flow.py] + [algo_training.py]

  5. Derek finds M3 to be sufficiently trained for his commercial use-case. He buys access to the inference endpoints using the datatokens in Charlie's possession, completing the demand loop. [tests/posthuman-legacy/Bob_infer_flow.py] + [algo_inference.py]

  6. Elena is unsure if the model she is using (M3) is worth what she is paying. She runs an [algo_evaluation.py] C2D request and learns that the model she's using does indeed have better performance on her dataset than the published SoTA. [tests/posthuman-legacy/Bob_eval_flow.py]
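The economic loop in the steps above can be illustrated with a toy in-memory ledger. This is purely illustrative: the class, method names, and reward amounts are all ours, and the real accounting happens through Ocean datatokens on-chain.

```python
class ToyMarket:
    """Minimal sketch of the publish -> fine-tune -> reward loop."""

    def __init__(self):
        self.models = []
        self.balances = {}  # (model, holder) -> datatoken balance

    def publish(self, model, publisher, reward=100):
        """Publishing a model mints datatokens to its publisher (step 1)."""
        self.models.append(model)
        self.balances[(model, publisher)] = reward

    def finetune(self, base, new_model, trainer, base_publisher, reward=100):
        """Fine-tuning publishes a new asset and rewards both the trainer
        and the publisher of the base model (steps 2-4); the split here
        is an arbitrary placeholder."""
        self.publish(new_model, trainer, reward)
        self.balances[(new_model, base_publisher)] = reward // 2


market = ToyMarket()
market.publish("M1", "alice")                  # step 1
market.finetune("M1", "M2", "bob", "alice")    # steps 2-3
market.finetune("M2", "M3", "charlie", "bob")  # step 4
```

Consumers like Derek and Elena then buy existing datatokens from the holders rather than minting new ones, which is what creates the demand side of the loop.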

To get a hands-on understanding, we've developed READMEs for each of these users - check out the README folder. Furthermore, Posthuman v0.2 now includes a number of tests of the above functionality - check out the tests/posthuman folder.

ocean.py is a Python library to privately & securely publish, exchange, and consume data.

With ocean.py, you can:

  • Publish data services: downloadable files or compute-to-data. Ocean creates a new ERC20 datatoken for each dataset / data service.
  • Mint datatokens for the service
  • Sell datatokens via an OCEAN-datatoken Balancer pool (for auto price discovery), or for a fixed price
  • Stake OCEAN on datatoken pools
  • Consume datatokens, to access the service
  • Transfer datatokens to another owner, and all other ERC20 actions using web3.py etc.

ocean.py is part of the Ocean Protocol toolset.

ocean.py is in beta, so you can expect to run into problems. If you do, please open a new issue.

๐Ÿ— Installation

pip install ocean-lib

๐Ÿ„ Quickstart

Simple Flow

Publish your first datatoken - connect to Ethereum, create an Ocean instance, and publish.

Learn more

Marketplace flow

Create a marketplace and sell data - batteries-included flow including using off-chain services for metadata and consuming datasets.

🦑 Development

If you want to further develop ocean.py, then please go here.

๐Ÿ› License

Copyright (C) 2021 Ocean Protocol Foundation

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
