# Count and Label DNA strands

This is a proof of concept (POC), developed with **Python 3**, **NumPy** and **OpenCV**, that counts and labels the _DNA strands_ in an image of _combed DNA molecules_.

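Strands are segmented with **Adaptive Mean Thresholding**, which compares each pixel with the mean of its local neighbourhood instead of with a single global threshold.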
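
To make the idea concrete, here is a minimal NumPy sketch of mean adaptive thresholding. It is an illustration, not OpenCV's implementation: the function name `adaptive_mean_threshold` and its parameter names are assumptions, borders are handled by replicating edge pixels, and results may differ from `cv2.adaptiveThreshold` in rounding.

```python
import numpy as np


def adaptive_mean_threshold(gray, block_size=5, c=-7, max_value=255):
    """Illustrative mean adaptive threshold for a 2-D uint8 image."""
    if block_size < 3 or block_size % 2 != 1:
        raise ValueError("block_size must be an odd integer >= 3")

    rows, cols = gray.shape
    pad = block_size // 2
    # Replicate border pixels so every pixel has a full neighbourhood
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")

    # Add up the block_size x block_size shifted views to get local sums
    local_sum = np.zeros((rows, cols), dtype=np.float64)
    for dy in range(block_size):
        for dx in range(block_size):
            local_sum += padded[dy:dy + rows, dx:dx + cols]
    local_mean = local_sum / (block_size * block_size)

    # Keep a pixel when it is brighter than (local mean - c)
    return np.where(gray > local_mean - c, max_value, 0).astype(np.uint8)
```

With a negative `c`, the per-pixel threshold sits above the local mean, so only pixels clearly brighter than their surroundings survive.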

In [None]:
import numpy as np
import cv2

In [None]:
fileName = "12da1dc4adab6e00952790c8e363a9a62e945fd3"
extn = ".jpg"
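# Read the image as grayscale
src = cv2.imread(fileName + extn, cv2.IMREAD_GRAYSCALE)
# cv2.imread returns None instead of raising when the file cannot be read
if src is None:
    raise FileNotFoundError(fileName + extn)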

Sample DNA Image
![Sample DNA Image](12da1dc4adab6e00952790c8e363a9a62e945fd3.jpg)

In [4]:
cv2.startWindowThread()
# See what you have got
cv2.imshow("Original Image", src)

In [5]:
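# Initialize parameters
height, width = src.shape
# Side of the square neighbourhood used for the local mean (must be odd and > 1)
blockSize = 5
# Constant subtracted from the local mean; it is negative here, so a pixel
# must exceed its local mean by more than 7 to be kept
C = -7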

In [6]:
# Apply Adaptive Mean Thresholding
amtImage = cv2.adaptiveThreshold(src, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, blockSize, C)
cv2.imwrite(fileName + '-after-thresholding' + extn, amtImage)
cv2.imshow("After Thresholding", amtImage)

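# Find image contours. OpenCV 3.x returns (image, contours, hierarchy) while
# OpenCV 4.x returns (contours, hierarchy), so take the last two values.
contours, hierarchy = cv2.findContours(amtImage, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2:]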

# Initialize numpy array, will be used to export the image with labels
labeledImage = np.zeros((height, width, 1), np.uint8)
color = (255, 255, 255)
scale = 1
count = 0

Image after thresholding
![Image after thresholding](12da1dc4adab6e00952790c8e363a9a62e945fd3-after-thresholding.jpg)

**Label** and **Length** values are printed to the console in _CSV_ format. Alternatively, they can be written to a file.
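
Each length comes from `cv2.arcLength` called with `closed` set to `False`, which sums the straight-line distances between consecutive contour points without joining the last point back to the first. The pure-Python sketch below shows that computation on a short list of points; the helper name `open_arc_length` is illustrative and not part of OpenCV. Because a contour follows the outline of a thresholded region, the value is an outline length rather than a centreline length.

```python
import math


def open_arc_length(points):
    """Sum the distances between consecutive (x, y) points of an open curve."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        total += math.hypot(x1 - x0, y1 - y0)
    return total
```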

In [7]:
# Header
print('number,length')

for contour in contours:
    arcLen = cv2.arcLength(contour, False)
    
    # Ignore small contours; will help to reduce noise
    if arcLen > 20:
        # A negative thickness fills the contour; cv2.FILLED states that explicitly
        cv2.drawContours(labeledImage, [contour * scale], -1, color, cv2.FILLED)
        count += 1
        countLabel = str(count)
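        # Use the contour's first point as the label anchor, as plain Python ints
        x = int(contour[0][0][0])
        y = int(contour[0][0][1])
        
        # Print label on image. The last argument is lineType, which expects an
        # integer constant such as cv2.LINE_8 rather than a bool.
        cv2.putText(labeledImage, countLabel, (x, y), cv2.FONT_HERSHEY_SIMPLEX, 0.3, 180, 1, cv2.LINE_8)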
        print(countLabel + ',' + str(arcLen))

number,length
1,138.23758959770203
2,61.25483310222626
3,241.4335446357727
4,68.35533821582794
5,33.21320295333862
6,73.98275470733643
7,41.627416491508484
8,22.14213538169861
9,44.52691173553467
10,25.727921843528748
11,37.041630029678345
12,24.242640614509583
13,30.38477599620819
14,25.899494767189026
15,25.727921843528748
16,75.91168713569641
17,227.63455522060394
18,33.45584309101105
19,323.102591753006
20,33.79898941516876
21,331.119837641716
22,162.2375875711441
23,40.041630268096924
24,386.52899849414825
25,152.7228683233261
26,54.84061932563782
27,38.455843806266785
28,102.91168713569641
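
As an alternative to printing, the same rows could be written to a file with Python's standard `csv` module. This is a minimal sketch: the function name `write_lengths_csv` is hypothetical, and it assumes the arc lengths of the kept contours have been collected into a list in label order.

```python
import csv


def write_lengths_csv(path, lengths):
    """Write a 'number,length' header followed by one row per strand."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["number", "length"])
        # Labels start at 1, matching the labels drawn on the image
        for number, length in enumerate(lengths, start=1):
            writer.writerow([number, length])
```

For example, `write_lengths_csv(fileName + "-lengths.csv", lengths)` would place the file next to the exported images, where `lengths` is the assumed list described above.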


In [None]:
# Write to file and see the output
cv2.imwrite(fileName + '-with-labels' + extn, labeledImage)
cv2.imshow("Final Image", labeledImage)

# Cleanup
cv2.waitKey(0)
cv2.destroyAllWindows()

Labeled DNA Image
![Labeled DNA Image](12da1dc4adab6e00952790c8e363a9a62e945fd3-with-labels.jpg)

### Next steps:  
* Length values are in pixels. They need to be converted to a physical unit of length (for example, micrometres) using the scale of the image.
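
One way this conversion might look, assuming the microscope's calibration is available as a single micrometres-per-pixel factor. The function name `pixels_to_micrometres` is hypothetical and the factor in the usage comment is a placeholder, not a measured value for the sample image.

```python
def pixels_to_micrometres(length_px, micrometres_per_pixel):
    """Convert a length in pixels to micrometres with a linear scale factor."""
    if micrometres_per_pixel <= 0:
        raise ValueError("micrometres_per_pixel must be positive")
    return length_px * micrometres_per_pixel


# Placeholder calibration; replace with the value for the actual imaging setup
# length_um = pixels_to_micrometres(arcLen, 0.5)
```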