This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] Kernelogic - Post Slingshot 2.8 continuation on NEXRAD dataset (2/3) #1005

Closed
kernelogic opened this issue Sep 23, 2022 · 26 comments

Comments

@kernelogic

kernelogic commented Sep 23, 2022

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

  • Organization Name: Fei Yan - Kernelogic
  • Website / Social Media: https://singularity-browser.kernelogic.ca/ Slack: Fei Yan
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB): 5 PiB
  • Weekly allocation of DataCap requested (usually between 1-100TiB): 750 TiB
  • On-chain address for first allocation: f154a4iq5mxq76avoooc5a3unchfbrjg7itkjfl6y

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

As in this LDN comment https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/432#issuecomment-1204902669 from @dkkapur, now that Slingshot 2.8 has ended, I'd like to continue storing the whole dataset to completion under a new LDN, following the same rules as before.

This dataset is 2 PB+. I have already received one 5 PB allocation, so this time I am requesting 15 PB so that I can have approximately 10 replicas in total.
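
For reference, the replica estimate above can be worked out as below. This is only a back-of-the-envelope sketch: it treats "2PB+" as exactly 2 PB, and the function and variable names are illustrative rather than part of any Filecoin tooling.

```python
def approx_replicas(total_datacap_pb: float, dataset_size_pb: float) -> float:
    """Rough number of full copies that a given amount of DataCap can cover."""
    return total_datacap_pb / dataset_size_pb


existing_pb = 5    # DataCap already received, as stated in the application
requested_pb = 15  # DataCap requested, as stated in the application
dataset_pb = 2     # assumption: "2PB+" rounded down to 2 PB

replicas = approx_replicas(existing_pb + requested_pb, dataset_pb)
print(replicas)
```

Because the dataset is somewhat larger than 2 PB, the real replica count would come out slightly below this figure.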

I have participated in every Slingshot phase and am probably the best-performing "small individual client".

I have successfully completed several LDNs on other datasets, and my record shows that I have followed the decentralization rules and have zero self-dealing.

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/60
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/59
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/46
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/297
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/298
https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/304

What is the primary source of funding for this project?

Self-funded, BigD exchange.

What other projects/ecosystem stakeholders is this project associated with?

enterprise-sp-wg, BigD exchange.

Use-case details

Describe the data being stored onto Filecoin

Real-time and archival data from the Next Generation Weather Radar (NEXRAD) network.

Where was the data in this dataset sourced from?

https://registry.opendata.aws/noaa-nexrad/

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

The data is primarily compressed binary data. The site below demonstrates how to consume and render the data:
https://nbviewer.org/gist/dopplershift/356f2e14832e9b676207

s3://noaa-nexrad-level2/2021/01/01/TSDF/TSDF20210101_235417_V08
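
As a rough illustration of how the sample object above is named, the sketch below splits the S3 URI into bucket, radar site, scan timestamp, and trailing suffix. The layout `YYYY/MM/DD/SITE/SITEYYYYMMDD_HHMMSS_SUFFIX` is inferred from this one example only, and the helper and type names are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple
from urllib.parse import urlparse


class NexradObject(NamedTuple):
    bucket: str
    station: str
    scan_time: datetime  # naive datetime; no time zone is assumed here
    suffix: str


def parse_level2_uri(uri: str) -> NexradObject:
    """Parse an s3:// URI shaped like the sample in this application."""
    parsed = urlparse(uri)
    # Assumed key layout: YYYY/MM/DD/SITE/SITEYYYYMMDD_HHMMSS_SUFFIX
    year, month, day, station, filename = parsed.path.lstrip("/").split("/")
    if not filename.startswith(station):
        raise ValueError(f"file name {filename!r} does not start with {station!r}")
    date_part, time_part, suffix = filename[len(station):].split("_", 2)
    if date_part != f"{year}{month}{day}":
        raise ValueError("date in the file name does not match the key prefix")
    scan_time = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    return NexradObject(parsed.netloc, station, scan_time, suffix)


sample = parse_level2_uri(
    "s3://noaa-nexrad-level2/2021/01/01/TSDF/TSDF20210101_235417_V08"
)
print(sample)
```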

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

AWS open dataset

What is the expected retrieval frequency for this data?

Infrequent. However, all details are available in my browser: https://slingshot.kernelogic.ca/nexrad.html?v=2.8

For how long do you plan to keep this dataset stored on Filecoin?

Between 365 and 520 days.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

All regions.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

I will upload my prepared CAR files to a web server and coordinate with providers to download and propose offline deals.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

Besides the SPs I have worked with previously, I also use BigD exchange to further decentralize the storage.

To name a few from the community that I deal with regularly: PIKNIK, Holon, CabrinaHuang, HarryM, BigBear, j1v, XinAn Xu, WillTechMusing.

From BigD exchange: Mog Li, Devin Chen, DSS Nathanial Marsh, Rabinovitch, Vin K, arockpool Tony

How will you be distributing deals across storage providers?

Evenly across all providers I propose deals to, if they can handle the volume. If a miner is itself a notary, that notary will receive no more than 20% of the total granted DataCap.
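
One way to picture the distribution rule above is the sketch below: an even split across providers, with any notary-run provider capped at 20% of the total and the remainder spread over the others. The provider names are placeholders, the function is hypothetical, and this is not the actual deal-making tooling used for this application.

```python
def plan_allocation(total_tib, providers, notary_providers, notary_cap=0.20):
    """Split DataCap evenly, capping notary-run providers at notary_cap."""
    cap_tib = total_tib * notary_cap
    even_share = total_tib / len(providers)
    if even_share <= cap_tib:
        # With enough providers, the even share already respects the cap.
        return {p: even_share for p in providers}

    capped = [p for p in providers if p in notary_providers]
    others = [p for p in providers if p not in notary_providers]
    if not others:
        raise ValueError("cannot respect the cap with notary providers only")
    remaining = total_tib - cap_tib * len(capped)
    plan = {p: cap_tib for p in capped}
    plan.update({p: remaining / len(others) for p in others})
    return plan


total = 5 * 1024  # 5 PiB expressed in TiB

# Eight placeholder providers: every share is 12.5%, so the cap never applies.
many = [f"sp-{i}" for i in range(8)]
even_plan = plan_allocation(total, many, notary_providers={"sp-0"})

# Four placeholder providers: an even 25% share would exceed the 20% cap.
few = ["notary-sp", "sp-a", "sp-b", "sp-c"]
capped_plan = plan_allocation(total, few, notary_providers={"notary-sp"})
print(even_plan["sp-0"], capped_plan["notary-sp"])
```

With five or more providers an even split already keeps every provider at or below 20%, so the cap only matters for small provider sets.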

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

I have all I need to start making deals.
@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@large-datacap-requests

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

@raghavrmadya
Collaborator

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

750TiB

Client address

f1yy7riqoc3vm7jv6nawupnytj4m6sajfuq7kqn6q

@large-datacap-requests

DataCap Allocation requested

Multisig Notary address

f01858410

Client address

f154a4iq5mxq76avoooc5a3unchfbrjg7itkjfl6y

DataCap allocation requested

256TiB


Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacedsmaqgnnyppxfinare2toyhbeqjkmw4gb2y546yy76a4ug4pl2bc

Address

f154a4iq5mxq76avoooc5a3unchfbrjg7itkjfl6y

Datacap Allocated

256.00TiB

Signer Address

f122qmy25wdtt5mxd77kndiq7z5x2n3iwiuz2wdsa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedsmaqgnnyppxfinare2toyhbeqjkmw4gb2y546yy76a4ug4pl2bc


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaced2i7po66qk732esiruvsm3z3iayr7stm5vad6eckyor27fv7qx6u

Address

f154a4iq5mxq76avoooc5a3unchfbrjg7itkjfl6y

Datacap Allocated

256.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaced2i7po66qk732esiruvsm3z3iayr7stm5vad6eckyor27fv7qx6u


Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecilfu3lidyxfus26odsruibawo7wesxcnyjghq5s4p7sba4bxtkg

Address

f154a4iq5mxq76avoooc5a3unchfbrjg7itkjfl6y

Datacap Allocated

2.00PiB

Signer Address

f1pszcrsciyixyuxxukkvtazcokexbn54amf7gvoq

Id

9fcf4770-4137-49f3-a166-8f76a973e5e5

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecilfu3lidyxfus26odsruibawo7wesxcnyjghq5s4p7sba4bxtkg


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceagccbriwhnpm5owmwrem44i22p5mny7pjcjadnemg3pwlk2scaoe

Address

f154a4iq5mxq76avoooc5a3unchfbrjg7itkjfl6y

Datacap Allocated

2.00PiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

9fcf4770-4137-49f3-a166-8f76a973e5e5

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceagccbriwhnpm5owmwrem44i22p5mny7pjcjadnemg3pwlk2scaoe

@filplus-checker

DataCap and CID Checker Report [1]

  • Organization: Fei Yan - Kernelogic
  • Client: f154a4iq5mxq76avoooc5a3unchfbrjg7itkjfl6y

Storage Provider Distribution

The table below shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider has taken a verified deal, it is marked as new.

For most DataCap applications, the restrictions below should apply.

  • Storage provider should not exceed 25% of total datacap.
  • Storage provider should not be storing duplicate data for more than 20%.
  • Storage provider should have published its public IP address.
  • All storage providers should be located in different regions.

✔️ Storage provider distribution looks healthy.

Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals
f01923786 | Hong Kong, Central and Western, HK | 126.53 TiB | 6.57% | 126.16 TiB | 0.30%
f01909705 | Kuala Lumpur, Kuala Lumpur, MY | 125.88 TiB | 6.53% | 125.88 TiB | 0.00%
f01918046 | Kuala Lumpur, Kuala Lumpur, MY | 125.84 TiB | 6.53% | 125.84 TiB | 0.00%
f01918045 | Kuala Lumpur, Kuala Lumpur, MY | 125.81 TiB | 6.53% | 125.81 TiB | 0.00%
f01859603 | Shenzhen, Guangdong, CN | 108.06 TiB | 5.61% | 107.81 TiB | 0.23%
f01938671 | Hong Kong, Central and Western, HK | 96.84 TiB | 5.03% | 96.47 TiB | 0.39%
f0143858 | Clifton, New Jersey, US | 87.19 TiB | 4.53% | 87.19 TiB | 0.00%
f01938674 | Shenzhen, Guangdong, CN | 85.69 TiB | 4.45% | 85.25 TiB | 0.51%
f03223 | San Jose, California, US | 84.91 TiB | 4.41% | 84.91 TiB | 0.00%
f02301 | San Jose, California, US | 81.25 TiB | 4.22% | 81.25 TiB | 0.00%
f0240185 | Clifton, New Jersey, US | 77.91 TiB | 4.04% | 77.91 TiB | 0.00%
f01923787 | Shenzhen, Guangdong, CN | 75.28 TiB | 3.91% | 75.00 TiB | 0.37%
f01660795 | Shenzhen, Guangdong, CN | 73.47 TiB | 3.81% | 73.31 TiB | 0.21%
f01907578 (new) | Fuzhou, Fujian, CN | 68.75 TiB | 3.57% | 68.75 TiB | 0.00%
f01915033 | Chengdu, Sichuan, CN | 64.38 TiB | 3.34% | 64.38 TiB | 0.00%
f01949260 | Shanghai, Shanghai, CN | 62.00 TiB | 3.22% | 62.00 TiB | 0.00%
f01927554 | Shenzhen, Guangdong, CN | 61.06 TiB | 3.17% | 60.94 TiB | 0.20%
f01949267 | Shanghai, Shanghai, CN | 55.91 TiB | 2.90% | 55.91 TiB | 0.00%
f01518369 | San Jose, California, US | 46.75 TiB | 2.43% | 46.75 TiB | 0.00%
f01889668 | San Jose, California, US | 46.75 TiB | 2.43% | 46.75 TiB | 0.00%
f01946104 | Chengdu, Sichuan, CN | 37.50 TiB | 1.95% | 37.50 TiB | 0.00%
f0142637 | Chengdu, Sichuan, CN | 37.50 TiB | 1.95% | 37.50 TiB | 0.00%
f01924827 | Hong Kong, Central and Western, HK | 32.88 TiB | 1.71% | 32.88 TiB | 0.00%
f01904630 | Las Vegas, Nevada, US | 20.91 TiB | 1.09% | 20.91 TiB | 0.00%
f01985775 | Dallas, Texas, US | 18.75 TiB | 0.97% | 18.75 TiB | 0.00%
f033462 | Dallas, Texas, US | 18.75 TiB | 0.97% | 18.75 TiB | 0.00%
f01943663 | Hong Kong, Central and Western, HK | 18.63 TiB | 0.97% | 18.63 TiB | 0.00%
f01985745 | Dallas, Texas, US | 18.41 TiB | 0.96% | 18.41 TiB | 0.00%
f0397083 | Tokyo, Tokyo, JP | 18.00 TiB | 0.93% | 18.00 TiB | 0.00%
f01240218 | Chengdu, Sichuan, CN | 9.38 TiB | 0.49% | 9.38 TiB | 0.00%
f01873432 | Las Vegas, Nevada, US | 7.06 TiB | 0.37% | 7.06 TiB | 0.00%
f0870354 | Beijing, Beijing, CN | 4.84 TiB | 0.25% | 4.84 TiB | 0.00%
f01821041 | Vancouver, British Columbia, CA | 3.44 TiB | 0.18% | 3.44 TiB | 0.00%
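
The first two restrictions listed above (no provider over 25% of total DataCap, no provider with more than 20% duplicate data) could be checked along the lines below. This is a sketch, not the checker bot's actual implementation: the type and function names are made up, and only a few rows are copied from the table for brevity.

```python
from typing import NamedTuple


class ProviderRow(NamedTuple):
    provider: str
    share_pct: float      # "Percentage" column
    duplicate_pct: float  # "Duplicate Deals" column


def find_violations(rows, max_share_pct=25.0, max_duplicate_pct=20.0):
    """Return a human-readable message for every threshold a row exceeds."""
    problems = []
    for row in rows:
        if row.share_pct > max_share_pct:
            problems.append(
                f"{row.provider}: {row.share_pct:.2f}% of total DataCap"
            )
        if row.duplicate_pct > max_duplicate_pct:
            problems.append(
                f"{row.provider}: {row.duplicate_pct:.2f}% duplicate data"
            )
    return problems


# A few rows copied from the report table above.
top_rows = [
    ProviderRow("f01923786", 6.57, 0.30),
    ProviderRow("f01909705", 6.53, 0.00),
    ProviderRow("f01859603", 5.61, 0.23),
    ProviderRow("f01938674", 4.45, 0.51),
]
violations = find_violations(top_rows)
print(violations or "no threshold exceeded")
```

The largest share in the table is 6.57% and the largest duplicate rate is 0.51%, which is consistent with the "looks healthy" verdict for these two criteria.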

[Chart: Provider Distribution]

Deal Data Replication

The table below shows how much unique data is replicated across how many storage providers.

  • No more than 25% of unique data should be stored with fewer than 4 providers.

⚠️ 33.57% of deals are for data replicated across fewer than 4 storage providers.

Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage
103.59 TiB | 103.59 TiB | 1 | 5.38%
48.31 TiB | 96.63 TiB | 2 | 5.02%
148.78 TiB | 446.47 TiB | 3 | 23.18%
88.16 TiB | 352.69 TiB | 4 | 18.31%
59.56 TiB | 297.81 TiB | 5 | 15.46%
30.78 TiB | 184.69 TiB | 6 | 9.59%
45.47 TiB | 320.09 TiB | 7 | 16.62%
12.94 TiB | 103.50 TiB | 8 | 5.37%
2.31 TiB | 20.81 TiB | 9 | 1.08%
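
The 33.57% warning can be reproduced from the table above by summing the "Total Deals Made" column for rows with fewer than 4 providers and dividing by the column total. The sketch below does exactly that; the function name is illustrative, and it is an assumption that the checker computes its figure in this way.

```python
# (unique data TiB, total deals TiB, number of providers), copied from the table
replication_rows = [
    (103.59, 103.59, 1),
    (48.31, 96.63, 2),
    (148.78, 446.47, 3),
    (88.16, 352.69, 4),
    (59.56, 297.81, 5),
    (30.78, 184.69, 6),
    (45.47, 320.09, 7),
    (12.94, 103.50, 8),
    (2.31, 20.81, 9),
]


def under_replicated_pct(rows, min_providers=4):
    """Percentage of deal volume whose data sits with fewer than min_providers."""
    total = sum(deals for _, deals, _ in rows)
    low = sum(deals for _, deals, n in rows if n < min_providers)
    return 100.0 * low / total


pct = under_replicated_pct(replication_rows)
print(f"{pct:.2f}%")  # 33.57%
print("exceeds 25% limit" if pct > 25.0 else "within 25% limit")
```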

[Chart: Replication Distribution]

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients.
Usually, different applications own different data, so their data should not resolve to the same CID.

⚠️ CID sharing has been observed.

Other Client | Application | Total Deals Affected | Unique CIDs | Verifier
f1tszopsyo4fs75f4ia5h25btfob7t6upmj2w27qq | Fei Yan - Kernelogic | 1.17 PiB | 11,128 | LDN v3 multisig
f1yy7riqoc3vm7jv6nawupnytj4m6sajfuq7kqn6q | Fei Yan - Kernelogic | 1.16 PiB | 11,273 | LDN v3 multisig
f1ioosn6kwao6q34lxs4twrycjqeeir4pv3qh5cci | Fei Yan - Kernelogic | 577.75 TiB | 1,056 | LDN v3 multisig

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

@kernelogic
Author

keepalive

@C00kies77

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report Summary [1]

Storage Provider Distribution

⚠️ 1 storage provider sealed too much duplicate data: f01851060 (20.11%)

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients [2]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...


Thanks for your request!
❗ We have found some problems in the information provided.
We could not find Website / Social Media field in the information provided
We could not find Total amount of DataCap being requested (between 500 TiB and 5 PiB) field in the information provided
We could not find Weekly allocation of DataCap requested (usually between 1-100TiB) field in the information provided
We could not find On-chain address for first allocation field in the information provided
We could not find Data Type of Application field in the information provided

Please, take a look at the request and edit the body of the issue providing all the required information.


RootKeyHolders have approved the multisig account. You can now request the first DataCap release.


14 participants