# Cavalieri estimation of disector volumes
This [Jupyter](https://jupyter.org) notebook was used to check which samples still need a Cavalieri estimation for the final results of the [publication on the acinar complexity](https://www.authorea.com/274247/47HwqAxume3L2xkLOsg_SQ).

In [1]:
# Set up the notebook: import the modules we need
import platform
import glob
import os
import pandas

In [2]:
# We copied everything from nas_schittny and the terastation to the 'Fast SSD'.
# Load the data from there
# platform.dist() was removed in Python 3.8, so we check for Linux in general
if platform.system() == 'Linux':
    drive = '/media/habi/Fast_SSD/'
else:
    drive = 'F:\\'
# Load the data from this folder
RootPath = os.path.join(drive, 'Acini')
print('We are loading all the data from %s' % RootPath)

We are loading all the data from /media/habi/Fast_SSD/Acini


In [3]:
# Get a list of all the STEPanizer export files from Eveline
# Based on https://stackoverflow.com/a/14798263
StepanizerFiles_Eveline = sorted(glob.glob(os.path.join(RootPath, '**/*201[1234567]*.xls'), recursive=True))
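
The glob pattern above combines a recursive `**` with the character class `201[1234567]`, which is what keeps the 2018 files out of Eveline's list. The following is a minimal sketch of that behaviour in a temporary folder; the folder and file names are hypothetical and only serve as illustration.

```python
import glob
import os
import tempfile

# Hypothetical folder layout and file names, only here to illustrate the pattern
relative_names = [os.path.join('beamtime_a', 'sample_2016-01-01.xls'),
                  os.path.join('beamtime_a', 'deeper', 'sample_2013-05-05.xls'),
                  os.path.join('beamtime_a', 'sample_2018-02-02.xls'),
                  os.path.join('beamtime_a', 'sample_2017-03-03.txt')]

with tempfile.TemporaryDirectory() as root:
    # Create empty files so that glob has something to find
    for relative in relative_names:
        path = os.path.join(root, relative)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, 'w').close()
    # '**' (with recursive=True) descends into any number of subfolders,
    # '201[1234567]' matches the years 2011 to 2017, but not 2018
    found = sorted(glob.glob(os.path.join(root, '**/*201[1234567]*.xls'), recursive=True))
    matched = [os.path.basename(f) for f in found]

print(matched)
```

Only the two `.xls` files from 2013 and 2016 end up in `matched`; the 2018 file and the `.txt` file are skipped.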

In [4]:
print('Eveline counted the alveoli in %s acini' % len(StepanizerFiles_Eveline))

Eveline counted the alveoli in 287 acini


In [5]:
Eveline = pandas.DataFrame({'Location': StepanizerFiles_Eveline})
Eveline['Filename'] = [os.path.basename(f) for f in StepanizerFiles_Eveline]
Eveline['Beamtime'] = [os.path.dirname(f).split('Acini')[1].split(os.sep)[1] for f in StepanizerFiles_Eveline]
Eveline['Sample'] = [os.path.basename(f).split('-acinus')[0][1:] for f in StepanizerFiles_Eveline]
Eveline['Animal'] = [os.path.basename(f).split('_R108C')[1].split('mrg-')[0][:3] for f in StepanizerFiles_Eveline]
Eveline['Day'] = [int(os.path.basename(f).split('_R108C')[1].split('mrg-')[0][:2]) for f in StepanizerFiles_Eveline]
Eveline['Acinus'] = [int(os.path.basename(f).split('acinus')[1].split('_')[0]) for f in StepanizerFiles_Eveline]
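
To make the string splitting above easier to follow, here is a small sketch that applies the same splits to a single path. The helper name `parse_stepanizer_name` and the example path are assumptions made for illustration; the real file names may look different.

```python
import os


def parse_stepanizer_name(path):
    """Split one STEPanizer export path into the columns used in this notebook."""
    name = os.path.basename(path)
    # Everything between '_R108C' and 'mrg-' starts with the day and animal
    animal_part = name.split('_R108C')[1].split('mrg-')[0]
    return {'Beamtime': os.path.dirname(path).split('Acini')[1].split(os.sep)[1],
            'Sample': name.split('-acinus')[0][1:],
            'Animal': animal_part[:3],
            'Day': int(animal_part[:2]),
            'Acinus': int(name.split('acinus')[1].split('_')[0])}


# Hypothetical path, constructed to fit the splits above
example = os.path.join('Acini', '2010a', '_R108C21Bt-mrg-acinus05_2018-01-01.xls')
fields = parse_stepanizer_name(example)
print(fields)
```

For this made-up path the sample is `R108C21Bt-mrg`, the animal is `21B`, the day is 21 and the acinus is 5.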

In [6]:
# Get a list of *all* the Excel files from the STEPanizer that I counted
# Unlike the files from Eveline, these only have '2018' in the file name...
StepanizerFiles_David = sorted(glob.glob(os.path.join(RootPath, '**/*2018*.xls'), recursive=True))

In [7]:
print('David assessed the disector volume in %s acini' % len(StepanizerFiles_David))

David assessed the disector volume in 287 acini


In [8]:
David = pandas.DataFrame({'Location': StepanizerFiles_David})
David['Filename'] = [os.path.basename(f) for f in StepanizerFiles_David]
David['Beamtime'] = [os.path.dirname(f).split('Acini')[1].split(os.sep)[1] for f in StepanizerFiles_David]
David['Sample'] = [os.path.basename(f).split('-acinus')[0][1:] for f in StepanizerFiles_David]
David['Animal'] = [os.path.basename(f).split('_R108C')[1].split('mrg-')[0][:3] for f in StepanizerFiles_David]
David['Day'] = [int(os.path.basename(f).split('_R108C')[1].split('mrg-')[0][:2]) for f in StepanizerFiles_David]
David['Acinus'] = [int(os.path.basename(f).split('acinus')[1].split('_')[0]) for f in StepanizerFiles_David]

In [9]:
# Merge 'Eveline' and 'David' so we know what is still to do
# Based on https://stackoverflow.com/a/33350050/323100
StillToDo = pandas.merge(Eveline, David,
                         on=['Animal', 'Acinus', 'Day', 'Beamtime', 'Sample'],
                         how='outer', suffixes=['_Eveline', '_David'],
                         indicator=True)
StillToDo = StillToDo[StillToDo._merge != 'both']
print('We still need to assess the disector volume in %s acini...' % len(StillToDo))

We still need to assess the disector volume in 0 acini...
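
The outer merge with `indicator=True` adds a `_merge` column that is `both`, `left_only` or `right_only` for each row, and everything that is not `both` is still missing on one side. A minimal sketch with two made-up tables (the column values are hypothetical) shows the idea:

```python
import pandas

# Made-up data: three acini were counted, but only two have been assessed
counted = pandas.DataFrame({'Sample': ['A', 'A', 'B'], 'Acinus': [1, 2, 1]})
assessed = pandas.DataFrame({'Sample': ['A', 'B'], 'Acinus': [1, 1]})

# The outer merge keeps all rows, '_merge' tells us where each one came from
merged = pandas.merge(counted, assessed,
                      on=['Sample', 'Acinus'],
                      how='outer', indicator=True)
# Rows which are not in both tables are still to do
todo = merged[merged['_merge'] != 'both']
print(todo)
```

Here only acinus 2 of sample `A` remains, marked as `left_only`.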


In [10]:
# Merge 'Eveline' and 'David' so we have the ones we already did
# Based on https://stackoverflow.com/a/33350050/323100
Done = pandas.merge(Eveline, David,
                    on=['Animal', 'Acinus', 'Day', 'Beamtime', 'Sample'],
                    how='inner', suffixes=['_Eveline', '_David'],
                    indicator=True)
print('We have the data of %s acini...' % len(Done))

We have the data of 287 acini...


In [11]:
# Get the last image file
StillToDo['LastFile'] = [os.path.basename(sorted(glob.glob(os.path.join(os.path.dirname(location),
                                                                         '*_??_b.jpg')))[-1])
                         for location in StillToDo.Location_Eveline]

In [12]:
# See if we have more than 99 images...
StillToDo['LastImage'] = [[int(os.path.basename(i).split('_')[-2])
                           for i in glob.glob(os.path.join(os.path.dirname(location), '*.jpg'))]
                          for location in StillToDo.Location_Eveline]
StillToDo['LastImage'] = [max(li) for li in StillToDo['LastImage']]
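
The reason for extracting the image number separately is that the `*_??_b.jpg` pattern only matches two-character numbers, and sorting by name puts `102` before `99`. The sketch below illustrates this with hypothetical image names, using `fnmatch` in place of `glob` so that no files are needed.

```python
import fnmatch

# Hypothetical image names, one of them with a number above 99
names = ['acinus05_02_b.jpg', 'acinus05_99_b.jpg', 'acinus05_102_b.jpg']

# The two question marks match exactly two characters, so image 102 is missed
two_digit = fnmatch.filter(names, '*_??_b.jpg')
# Sorting by name compares character by character, so '102' sorts before '99'
last_by_name = sorted(names)[-1]
# Converting the second-to-last underscore-separated part to int gives the true maximum
last_by_number = max(int(name.split('_')[-2]) for name in names)

print(two_digit)
print(last_by_name)
print(last_by_number)
```

For these names the last file by name is `acinus05_99_b.jpg`, while the highest image number is 102.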

In [13]:
# Show the 'still to do' acini in random order.
# Use this order to assess the disector volume
print('From the %s acini still to count, here are some, randomly selected' % len(StillToDo))
if StillToDo.empty:
    print('We are all done!')
else:
    # print() is needed, a bare expression inside a block is not displayed
    print(StillToDo.sample(n=len(StillToDo))[['Beamtime', 'Sample', 'Acinus', 'LastImage', 'LastFile']])

From the 0 acini still to count, here are some, randomly selected
We are all done!
