# How Big is NeuroData?
In this notebook, we'll see how to answer this question by characterizing the size of the NeuroData database.

To begin, let's collect the tokens of every public project stored in OCP:

In [1]:
import ndio.remote.OCP as OCP
oo = OCP()

In [2]:
publics = oo.get_public_tokens()

Now we have the names of every public project. To get the metadata for each project, we can use the `get_proj_info` function:

In [3]:
example_info = oo.get_proj_info(publics[0])

The `imagesize` key, indexed at resolution `0` (the native resolution), gives the bounds of that dataset.

In [4]:
example_bounds = example_info['dataset']['imagesize']['0']

By multiplying these together, we can determine the number of voxels stored in this dataset:

In [5]:
example_size = reduce(lambda x, y: x * y, example_bounds)
print example_size

1059166617600


Assuming each voxel occupies one byte (8-bit image data), this is also how many `byte`s it takes to store this volume.
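The multiplication above can be wrapped in a small helper. This is a minimal sketch, not part of `ndio`: the function names `voxel_count` and `size_in_bytes`, the `bytes_per_voxel` parameter, and the bounds used below are all hypothetical. The voxel count equals the byte count only when each voxel takes one byte, so the helper makes that assumption explicit.

```python
from functools import reduce  # built in on Python 2, must be imported on Python 3
import operator


def voxel_count(bounds):
    """Multiply the per-axis extents together to get the number of voxels."""
    return reduce(operator.mul, bounds, 1)


def size_in_bytes(bounds, bytes_per_voxel=1):
    """Estimate storage size; the default of 1 assumes 8-bit data."""
    return voxel_count(bounds) * bytes_per_voxel


# Hypothetical bounds, not taken from any real project.
hypothetical_bounds = [1024, 1024, 100]
print(voxel_count(hypothetical_bounds))
print(size_in_bytes(hypothetical_bounds, bytes_per_voxel=2))
```

With real metadata, `example_bounds` from the cell above could be passed in place of `hypothetical_bounds`.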

## Bigger Numbers
So that tells us the size of *one* dataset. Now let's loop through *every* dataset, and sum them.

> <small>**Note:** This may take a while. Grab some coffee!</small>

In [9]:
total_bytes = 0
for proj in publics:
    try:
        print "Fetching metadata for {}.".format(proj)
        info = oo.get_proj_info(proj)
    except Exception:
        print "Could not fetch metadata for {}.".format(proj)
        # Skip this project so stale or undefined `info` is never counted.
        continue
    bounds = info['dataset']['imagesize']['0']
    total_bytes += reduce(lambda x, y: x*y, bounds)

print total_bytes

Fetching metadata for ac3.
Fetching metadata for ac4.
Fetching metadata for acardona_0111_8.
Fetching metadata for acardona_abd1_5.
Fetching metadata for bock11.
Fetching metadata for cajal_demo.
Fetching metadata for cv_ac3_membrane_2014.
Fetching metadata for cv_ac3_vesicle_2014.
Fetching metadata for cv_kasthuri11_membrane_2014.
Fetching metadata for cv_kasthuri11_vesicle_2014.
Fetching metadata for Ex10R55.
Fetching metadata for Ex12R75.
Fetching metadata for Ex12R76.
Fetching metadata for Ex13R51.
Fetching metadata for Ex14R58.
Fetching metadata for Ex2R18C1.
Fetching metadata for Ex2R18C2.
Fetching metadata for Ex3R43C1.
Fetching metadata for Ex3R43C2.
Fetching metadata for Ex3R43C3.
Fetching metadata for Ex6R15C1.
Fetching metadata for Ex6R15C2.
Fetching metadata for flycol.
Fetching metadata for flyemanno.
Fetching metadata for freeman14.
Fetching metadata for kasthuri11.
Fetching metadata for kasthuri11cc_ac3_vesicles.
Fetching metadata for kasthuri14Maine.
Fetching metadata f

If you don't feel like waiting for that to run:

As of October 31, 2015, the above returns `95835957164974` bytes.
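The summing loop can also be written as a function that skips projects whose metadata cannot be fetched and reports which ones failed. The sketch below is an illustration under assumptions: `total_voxels` and `fake_fetch` are hypothetical names, and the metadata in `fake_fetch` is made up so the example runs without a network connection. It assumes only that the metadata has the `info['dataset']['imagesize']['0']` shape used earlier.

```python
from functools import reduce  # built in on Python 2, must be imported on Python 3
import operator


def total_voxels(tokens, fetch_info):
    """Sum native-resolution voxel counts, skipping projects that fail.

    fetch_info is any callable mapping a token to a metadata dict shaped
    like info['dataset']['imagesize']['0'].
    """
    total = 0
    failed = []
    for token in tokens:
        try:
            info = fetch_info(token)
            bounds = info['dataset']['imagesize']['0']
        except Exception:
            # Record the failure and move on instead of reusing old data.
            failed.append(token)
            continue
        total += reduce(operator.mul, bounds, 1)
    return total, failed


def fake_fetch(token):
    """Stand-in fetcher with made-up metadata (hypothetical projects)."""
    fake_db = {
        'proj_a': {'dataset': {'imagesize': {'0': [10, 20, 30]}}},
        'proj_b': {'dataset': {'imagesize': {'0': [2, 3, 4]}}},
    }
    return fake_db[token]  # raises KeyError for unknown tokens


total, failed = total_voxels(['proj_a', 'missing', 'proj_b'], fake_fetch)
print(total)
print(failed)
```

Against the live service, the same function could presumably be called as `total_voxels(publics, oo.get_proj_info)`.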

In [13]:
total_bytes = 95835957164974  # renamed to avoid shadowing the built-in `bytes`
gb = total_bytes * 1e-9
tb = gb * 1e-3

print "Terabytes: {}".format(tb)

Terabytes: 95.835957165


As of October 31, 2015, OCP is storing nearly 96 TB of public data.
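The figure above uses decimal terabytes (10^12 bytes). As a small sketch of the arithmetic, the hypothetical helpers below convert the same byte count to both decimal terabytes and binary tebibytes (2^40 bytes), which gives a somewhat smaller number.

```python
def to_terabytes(num_bytes):
    """Decimal terabytes: 1 TB = 10**12 bytes."""
    return num_bytes / 1e12


def to_tebibytes(num_bytes):
    """Binary tebibytes: 1 TiB = 2**40 bytes."""
    return num_bytes / float(2 ** 40)


total = 95835957164974  # byte count quoted above
print("TB:  {:.2f}".format(to_terabytes(total)))
print("TiB: {:.2f}".format(to_tebibytes(total)))
```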