# Generating Features from GeoTiff Files
GeoTIFF files covering India are available for a period of more than 20 years. This notebook derives features from those files for predicting district-wise crop yield in India.

Because of the `gdal` package, this notebook needs its own conda environment, so install every package it uses in that environment. From the Anaconda prompt, list the available environments and activate the right one:
```shell
$ conda info --envs 
$ activate env_name
```

In [1]:
from osgeo import ogr, osr, gdal

import fiona
from shapely.geometry import Point, shape

import numpy as np
import pandas as pd

import os
import sys
import tarfile

- For Windows
``` python
base_ = r"C:\Users\deepak\Desktop\Repo\Maps\Districts\Census_2011"  # raw string keeps the backslashes literal
```
- For macOS
``` python
base_ = "/Users/macbook/Documents/BTP/Satellite/Data/Maps/Districts/Census_2011"
```

In [2]:
# Change this path for Win7 / macOS (raw string keeps the backslashes literal)
bases = r"C:\Users\deepak\Desktop\Repo\Maps\Districts\Census\Dist.shp"
# base_ = "/Users/macbook/Documents/BTP/Satellite/Data/Maps/Districts/Census_2011"
fc = fiona.open(bases)

In [3]:
def reverse_geocode(pt):
    """Return the DISTRICT name of the first feature containing pt, or "NRI" if none does."""
    for feature in fc:
        if shape(feature['geometry']).contains(pt):
            return feature['properties']['DISTRICT']
    return "NRI"
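
`reverse_geocode` scans every feature and rebuilds its geometry for each point, which can get slow when it is called once per sampled pixel. One common remedy is to prepare the geometries once and reject most candidates with a cheap bounding-box test before the exact containment test. The sketch below illustrates the idea in plain Python, assuming simple polygons given as lists of `(lon, lat)` vertices. `point_in_ring`, `build_index`, `lookup` and the two toy districts are hypothetical stand-ins and not part of the notebook, where shapely's `.bounds` and `.contains()` would play the same roles.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_index(districts):
    """Precompute (minx, miny, maxx, maxy, name, ring) once per district."""
    index = []
    for name, ring in districts.items():
        xs = [p[0] for p in ring]
        ys = [p[1] for p in ring]
        index.append((min(xs), min(ys), max(xs), max(ys), name, ring))
    return index


def lookup(x, y, index, default="NRI"):
    """Return the first district containing (x, y), or default."""
    for minx, miny, maxx, maxy, name, ring in index:
        # Cheap bounding-box rejection before the exact test
        if minx <= x <= maxx and miny <= y <= maxy and point_in_ring(x, y, ring):
            return name
    return default


# Toy data: two unit squares side by side (hypothetical, not real districts)
toy = {
    "west": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "east": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
toy_index = build_index(toy)
print(lookup(0.5, 0.5, toy_index))
print(lookup(1.5, 0.5, toy_index))
print(lookup(5.0, 5.0, toy_index))
```

The index is built once, so the per-point cost is dominated by a few comparisons for every district whose bounding box does not contain the point.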

In [4]:
# base = "/Users/macbook/Documents/BTP/Satellite/Data/Sat" # macOS
base = r"G:\BTP\Satellite\Data\Test"  # Win7 (raw string keeps the backslashes literal)

In [5]:
# b = True
# for directory, subdirList, fileList in os.walk(base):
# #     if b:
# #         b = False
# #         continue
#     #print ("Directory: " + directory)
#     for filename in fileList:
#         if filename[0] != '.': print ("\t" + filename)  

In [6]:
def extract(filename, force=False):
    """Extract base/<name>.tar.gz into base/<name> unless that folder already exists."""
    root = os.path.splitext(os.path.splitext(filename)[0])[0]  # remove .tar.gz
    target = os.path.join(base, root)
    if os.path.isdir(target) and not force:
        # You may override by setting force=True.
        print('%s already present - Skipping extraction of %s' % (root, filename))
    else:
        print('Extracting data for %s' % root)
        sys.stdout.flush()
        # The context manager closes the archive even if extraction fails
        with tarfile.open(os.path.join(base, filename)) as tar:
            tar.extractall(target)
    return target

In [7]:
# Extract every .tar.gz archive found directly in base (skipped if already extracted).
# extract() resolves file names relative to base, so only that folder is listed.
for filename in sorted(os.listdir(base)):
    if filename.endswith(".tar.gz"):
        extract(filename)

LE07_L1TP_146039_20101223_20161211_01_T1 already present - Skipping extraction of LE07_L1TP_146039_20101223_20161211_01_T1.tar.gz
LE07_L1TP_146041_20101223_20161211_01_T1 already present - Skipping extraction of LE07_L1TP_146041_20101223_20161211_01_T1.tar.gz


In [8]:
directories = [os.path.join(base, d) for d in sorted(os.listdir(base)) if os.path.isdir(os.path.join(base, d))]
# print directories

In [9]:
ds = gdal.Open(os.path.join(base, "LE07_L1TP_146039_20101223_20161211_01_T1",
                            "LE07_L1TP_146039_20101223_20161211_01_T1_B1.TIF"))

Open one dataset as `ds` at this point: its projection supplies the source coordinate system for the transformation defined below.

In [10]:
# get the existing coordinate system
old_cs= osr.SpatialReference()
old_cs.ImportFromWkt(ds.GetProjectionRef())

# create the new coordinate system
wgs84_wkt = """
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.01745329251994328,
        AUTHORITY["EPSG","9122"]],
    AUTHORITY["EPSG","4326"]]"""
new_cs = osr.SpatialReference()
new_cs.ImportFromWkt(wgs84_wkt)

# create a transform object to convert between coordinate systems
transform = osr.CoordinateTransformation(old_cs,new_cs) 

In [11]:
def pixel2coord(x, y, xoff, a, b, yoff, d, e):
    """Return coordinates in the raster's coordinate system for pixel column x and row y.

    The last six arguments are the values of ds.GetGeoTransform(), in order.
    """
    xp = a * x + b * y + xoff
    yp = d * x + e * y + yoff
    return xp, yp
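
As a quick check of the affine mapping, the sketch below applies the same formula to a made-up north-up geotransform (30 m pixels, no rotation). The numbers in `gt` are illustrative assumptions, not values read from the Landsat scenes; the tuple follows the order returned by `GetGeoTransform()`.

```python
def pixel2coord(x, y, xoff, a, b, yoff, d, e):
    """Map pixel column x and row y to coordinates in the raster's CRS."""
    xp = a * x + b * y + xoff
    yp = d * x + e * y + yoff
    return xp, yp


# Hypothetical geotransform: origin (300000, 3500000), 30 m pixels, north-up
gt = (300000.0, 30.0, 0.0, 3500000.0, 0.0, -30.0)

corner = pixel2coord(10, 20, *gt)       # top-left corner of pixel (10, 20)
centre = pixel2coord(10.5, 20.5, *gt)   # centre of the same pixel

print(corner)  # (300300.0, 3499400.0)
print(centre)  # (300315.0, 3499385.0)
```

Integer pixel indices give the top-left corner of a pixel; adding 0.5 to both indices gives its centre.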

In [12]:
ricep = pd.read_csv(r"C:\Users\deepak\Desktop\Repo\BTP\Ricep.csv")  # raw string keeps the backslashes literal
ricep = ricep.drop(["Unnamed: 0"],axis=1)
ricep["value"] = ricep["Production"]/ricep["Area"]
ricep.head()

|   | State_Name | ind_district | Crop_Year | Season | Crop | Area | Production | phosphorus | X1 | X2 | X3 | X4 | value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Andhra Pradesh | anantapur | 1999 | kharif | Rice | 37991.0 | 105082.0 | 0.0 | 96800.0 | 75400.0 | 643.72 | 881.473 | 2.765971 |
| 1 | Andhra Pradesh | anantapur | 2000 | kharif | Rice | 39905.0 | 117680.0 | 0.0 | 105082.0 | 96800.0 | 767.351 | 643.72 | 2.949004 |
| 2 | Andhra Pradesh | anantapur | 2001 | kharif | Rice | 32878.0 | 95609.0 | 0.0 | 117680.0 | 105082.0 | 579.338 | 767.351 | 2.907993 |
| 3 | Andhra Pradesh | anantapur | 2002 | kharif | Rice | 29066.0 | 66329.0 | 0.0 | 95609.0 | 117680.0 | 540.07 | 579.338 | 2.282013 |
| 4 | Andhra Pradesh | anantapur | 2005 | kharif | Rice | 25008.0 | 69972.0 | 0.0 | 85051.0 | 44891.0 | 819.7 | 564.5 | 2.797985 |


## New features
----
> 12 months (Numbered 1 to 12)
>> 10 TIF files (12 for SAT_8)
>>> Mean & Variance
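
Because the pixels arrive one at a time, the mean and variance for each (month, band) pair can be accumulated incrementally instead of storing every pixel value. The following is a minimal sketch of that recurrence in plain Python; the `update` helper and the sample values are illustrative assumptions, and the variance is the sample variance (divisor n - 1), defined as 0 for a single sample.

```python
import statistics


def update(mean, var, n, x):
    """Fold one new value x into a running (mean, sample variance, count)."""
    n += 1
    if n == 1:
        return float(x), 0.0, n          # a single sample has no spread yet
    # mean_n = mean_{n-1} + (x - mean_{n-1}) / n
    new_mean = mean + (x - mean) / n
    # var_n = (n-2)/(n-1) * var_{n-1} + (x - mean_{n-1})^2 / n
    new_var = ((n - 2) / (n - 1)) * var + (x - mean) ** 2 / n
    return new_mean, new_var, n


values = [2, 4, 4, 4, 5, 5, 7, 9]        # made-up pixel values
mean, var, n = 0.0, 0.0, 0
for v in values:
    mean, var, n = update(mean, var, n, v)

# The running result agrees with the batch computation
assert abs(mean - statistics.mean(values)) < 1e-9
assert abs(var - statistics.variance(values)) < 1e-9
print(mean, var, n)
```

Floats are used throughout so that neither integer division nor small integer pixel types can distort the running values.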

In [13]:
a = np.full((ricep.shape[0], 1), np.nan)  # column of NaNs; np.NAN is no longer available in NumPy 2.0

In [14]:
# 'features' holds the column names of the new features.
# 'dictn' maps each mean/variance column name to its positional column index.
features = []
dictn = {}
k = 13  # ricep has 13 columns (positions 0-12), so the new columns start at 13
for i in range(1,13):
    for j in range(1,11):
        s = str(i) + "_B" + str(j) + "_"
        features.append(s+"M")
        features.append(s+"V")
        dictn[s+"M"] = k
        dictn[s+"V"] = k+1
        k = k+2

In [15]:
for i in range(1,13):
    for j in range(1,11):
        s = str(i) + "_B" + str(j) + "_"
        features.append(s+"Mn")
        features.append(s+"Vn")

In [16]:
len(features)

480
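
The count of 480 follows from 12 months × 10 bands × 2 statistics = 240 value columns, plus 240 matching count columns (`Mn`, `Vn`). The sketch below rebuilds the names in plain Python and checks that each count column sits exactly 240 positions after its value column, which is the offset the update loop relies on. The helper name `build_feature_names` is an assumption made for this illustration.

```python
def build_feature_names(months=12, bands=10):
    """Return the value columns (M, V) followed by their count columns (Mn, Vn)."""
    values, counts = [], []
    for i in range(1, months + 1):
        for j in range(1, bands + 1):
            s = "{}_B{}_".format(i, j)
            values += [s + "M", s + "V"]
            counts += [s + "Mn", s + "Vn"]
    return values + counts


names = build_feature_names()
offset = names.index("12_B1_Mn") - names.index("12_B1_M")
print(len(names), offset)  # 480 240
```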

In [17]:
tmp = pd.DataFrame(index=range(ricep.shape[0]),columns=features)
ricex = pd.concat([ricep,tmp], axis=1)

In [18]:
ricex.head()

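|   | State_Name | ind_district | Crop_Year | Season | Crop | Area | Production | phosphorus | X1 | X2 | ... | 12_B6_Mn | 12_B6_Vn | 12_B7_Mn | 12_B7_Vn | 12_B8_Mn | 12_B8_Vn | 12_B9_Mn | 12_B9_Vn | 12_B10_Mn | 12_B10_Vn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Andhra Pradesh | anantapur | 1999 | kharif | Rice | 37991.0 | 105082.0 | 0.0 | 96800.0 | 75400.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | Andhra Pradesh | anantapur | 2000 | kharif | Rice | 39905.0 | 117680.0 | 0.0 | 105082.0 | 96800.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | Andhra Pradesh | anantapur | 2001 | kharif | Rice | 32878.0 | 95609.0 | 0.0 | 117680.0 | 105082.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | Andhra Pradesh | anantapur | 2002 | kharif | Rice | 29066.0 | 66329.0 | 0.0 | 95609.0 | 117680.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | Andhra Pradesh | anantapur | 2005 | kharif | Rice | 25008.0 | 69972.0 | 0.0 | 85051.0 | 44891.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |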


In [19]:
k = 10
hits = 0

In [20]:
%time

for directory in directories:

    # Identify month, year and spacecraft ID from the scene folder name
    scene = os.path.basename(directory)
    date = scene.split('_')[3]
    satx = scene[3]
    month = str(int(date[4:6]))  # dictn keys use unpadded months ("1".."12")
    year = date[0:4]

    # Visit every GeoTIFF file
    for _, _, files in os.walk(directory):
        for filename in files:

            if filename.endswith(".TIF"):
                print(os.path.join(directory, filename))

                ds = gdal.Open(os.path.join(directory, filename))
                if ds is None: continue
                col, row, _ = ds.RasterXSize, ds.RasterYSize, ds.RasterCount
                xoff, a, b, yoff, d, e = ds.GetGeoTransform()

                # Band label from the file name (the same for every pixel of the file).
                # Only the first character after 'B' is read, which fits Landsat 7 names.
                parts = filename.split("_")[7:]
                band = parts[0].split(".")[0][1]
                bnd = band
                if band == '6':
                    # B6_VCID_1 stays band 6, B6_VCID_2 is stored as band 9
                    if parts[2][0] != '1':
                        bnd = '9'
                elif band == 'Q':
                    bnd = '10'  # BQA is stored as band 10
                cm = dictn[month + "_B" + bnd + "_M"]  # column index of the mean

                # Sample a regular grid of pixels; '//' keeps the step an integer.
                # For each sample: find its lon/lat and district, find the row with the
                # same (year, district) in the crop dataset, then update mean and variance.
                for i in range(0, col, col // k):
                    for j in range(0, row, row // k):

                        # Raster coordinates -> lon/lat
                        x, y = pixel2coord(i, j, xoff, a, b, yoff, d, e)
                        lonx, latx, z = transform.TransformPoint(x, y)

                        # Name of the district containing the point
                        district = reverse_geocode(Point(lonx, latx))
                        if district == "NRI": continue

                        # Pixel value at that location, as a float so that integer
                        # raster values cannot overflow or truncate in the updates below
                        pix = float(ds.ReadAsArray(i, j, 1, 1)[0][0])

                        # Locate the row of the DataFrame to update
                        district = district.lower().strip()
                        r = ricex.index[(ricex['ind_district'] == district) & (ricex['Crop_Year'] == int(year))].tolist()

                        if len(r) == 1:
                            hits = hits + 1
                            print ("Hits: ", hits)
                            r = r[0]  # row index

                            valm = ricex.iloc[r, cm]
                            if pd.isnull(valm):
                                # First sample: mean = value, variance = 0, count = 1
                                ricex.iloc[r, cm] = pix
                                ricex.iloc[r, cm+1] = 0.0
                                ricex.iloc[r, cm+240] = 1.0
                                continue

                            # Later samples: running mean and sample-variance update
                            valv = ricex.iloc[r, cm+1]
                            n = ricex.iloc[r, cm+240] + 1.0
                            ricex.iloc[r, cm] = valm + (pix - valm) / n
                            ricex.iloc[r, cm+1] = ((n-2)/(n-1))*valv + (pix - valm)**2 / n
                            ricex.iloc[r, cm+240] = n

                        else:
                            print ("No match for the district " + district + " for the year " + year)
                    

Wall time: 0 ns
G:\BTP\Satellite\Data\Test\LE07_L1TP_146039_20101223_20161211_01_T1\LE07_L1TP_146039_20101223_20161211_01_T1_B1.TIF
('Hits: ', 1)
('Hits: ', 2)
('Hits: ', 3)
('Hits: ', 4)
('Hits: ', 5)
('Hits: ', 6)
('Hits: ', 7)
('Hits: ', 8)
('Hits: ', 9)
('Hits: ', 10)
No match for the district panipat for the year 2010
('Hits: ', 11)
No match for the district shimla for the year 2010
No match for the district sirmaur for the year 2010
No match for the district sirmaur for the year 2010
No match for the district sirmaur for the year 2010
('Hits: ', 12)
('Hits: ', 13)
No match for the district saharanpur for the year 2010
No match for the district saharanpur for the year 2010
('Hits: ', 14)
('Hits: ', 15)
No match for the district shimla for the year 2010
No match for the district shimla for the year 2010
No match for the district shimla for the year 2010
No match for the district sirmaur for the year 2010
No match for the district sirmaur for the year 2010
('Hits: ', 16)
[Remaining output abridged; the exported log is cut off in several places. It continues in the same pattern through LE07_L1TP_146039_20101223_20161211_01_T1_B5.TIF and later files, the hit counter reaches 466 in the visible portion, and "No match" is also reported for hardwar, uttarkashi, tehri garhwal, garhwal, chamoli, almora, bageshwar, kushinagar, deoria, pashchim champaran, gopalganj, basti, balrampur, siddharth nagar and sant kabir nagar, all for the year 2010.]

In [21]:
ricex.describe()

|   | Crop_Year | Area | Production | phosphorus | X1 | X2 | X3 | X4 | value |
|---|---|---|---|---|---|---|---|---|---|
| count | 1931.0 | 1931.0 | 1931.0 | 1931.0 | 1931.0 | 1931.0 | 1931.0 | 1931.0 | 1931.0 |
| mean | 2004.201968 | 69936.771103 | 155605.2 | 0.557224 | 158862.5 | 156724.6 | 861.66757 | 871.507723 | 1.950687 |
| std | 3.631699 | 81619.498004 | 215125.0 | 0.665363 | 217685.5 | 217087.3 | 476.198957 | 458.603987 | 1.102191 |
| min | 1999.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 76.944 | 108.8 | 0.0 |
| 25% | 2001.0 | 6392.5 | 6326.0 | 0.0 | 7545.5 | 6987.5 | 592.7 | 607.83 | 1.021009 |
| 50% | 2005.0 | 41040.0 | 71900.0 | 0.0 | 75748.0 | 72048.0 | 765.714 | 773.6 | 1.882997 |
| 75% | 2007.0 | 111052.0 | 233305.0 | 1.0 | 237264.0 | 228518.5 | 1023.702 | 1045.6795 | 2.645009 |
| max | 2010.0 | 545965.0 | 1637000.0 | 2.0 | 1710000.0 | 1710000.0 | 4755.7 | 4076.2 | 9.886125 |


In [22]:
ricex.to_csv("ricex_test1.csv")

In [23]:
fc

<open Collection 'C:\Users\deepak\Desktop\Repo\Maps\Districts\Census\Dist.shp:Dist', mode 'r' at 0x41ebf28L>

In [6]:
fc.schema

{'geometry': 'Polygon',
 'properties': OrderedDict([(u'DISTRICT', 'str:28'),
              (u'ST_NM', 'str:24'),
              (u'ST_CEN_CD', 'int:9'),
              (u'DT_CEN_CD', 'int:9'),
              (u'censuscode', 'float:14')])}

In [7]:
fc.crs

{'init': u'epsg:4326'}

In [8]:
len(fc)

641

In [10]:
for directory in directories:

    # Identify month, year and spacecraft ID from the scene folder name
    scene = os.path.basename(directory)
    date = scene.split('_')[3]
    satx = scene[3]
    month = date[4:6]
    year = date[0:4]

    print("LANDSAT {},  MONTH: {}, YEAR: {}".format(satx, month, year))

    # Visit every GeoTIFF file and report its band suffix and raster size
    for _, _, files in os.walk(directory):
        for filename in files:
            if filename.endswith(".TIF"):
                print(filename.split("_")[7:])
                ds = gdal.Open(os.path.join(directory, filename))
                if ds is None: continue
                col, row = ds.RasterXSize, ds.RasterYSize
                print("Col: {0:6},  Row:{1:6}".format(col, row))

                # Next steps, done in the main loop: map each pixel to its lat/lon and
                # district, find the (year, district) row in the crop dataset, and
                # update the mean and variance feature for this month and band.

LANDSAT 7,  MONTH: 12, YEAR: 2010
['B1.TIF']
Col:   8161,  Row:  7221
['B2.TIF']
Col:   8161,  Row:  7221
['B3.TIF']
Col:   8161,  Row:  7221
['B4.TIF']
Col:   8161,  Row:  7221
['B5.TIF']
Col:   8161,  Row:  7221
['B6', 'VCID', '1.TIF']
Col:   8161,  Row:  7221
['B6', 'VCID', '2.TIF']
Col:   8161,  Row:  7221
['B7.TIF']
Col:   8161,  Row:  7221
['B8.TIF']
Col:  16321,  Row: 14441
['BQA.TIF']
Col:   8161,  Row:  7221
LANDSAT 7,  MONTH: 12, YEAR: 2010
['B1.TIF']
Col:   7931,  Row:  6961
['B2.TIF']
Col:   7931,  Row:  6961
['B3.TIF']
Col:   7931,  Row:  6961
['B4.TIF']
Col:   7931,  Row:  6961
['B5.TIF']
Col:   7931,  Row:  6961
['B6', 'VCID', '1.TIF']
Col:   7931,  Row:  6961
['B6', 'VCID', '2.TIF']
Col:   7931,  Row:  6961
['B7.TIF']
Col:   7931,  Row:  6961
['B8.TIF']
Col:  15861,  Row: 13921
['BQA.TIF']
Col:   7931,  Row:  6961


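So for Landsat 7 the rasters are roughly **8000 × 7000** pixels (columns × rows). The exception is Band 8, the 15 m panchromatic band, which has twice the linear resolution of the other bands and is therefore about **16K × 14K**.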

----
Pseudo Code
---
> Go to the base folder and extract every archive that has not been extracted yet:
>> For each folder present here:
>>> For each TIFF file (one per band):
>>>> Identify the following:
- Month, Year
- District Name
- Cloud Cover Percentage
- Satellite 7 or 8 (possibly from the number of files in the folder)

>>>> The meaning of the bands depends on the satellite, so put each band into its corresponding feature.

>>>> Traverse every 100th pixel (for Landsat 7, every Kth).
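
The identification step can be sketched from the folder names printed earlier, such as `LE07_L1TP_146039_20101223_20161211_01_T1`. The helper names `parse_scene` and `band_label` are hypothetical; the field positions are taken from that example name, and the band mapping mirrors the one used in the main loop (second thermal file as band 9, quality band as band 10). Only the Landsat 7 names shown in this notebook are covered.

```python
def parse_scene(scene):
    """Split a scene folder name into spacecraft digit, year and month."""
    parts = scene.split("_")
    date = parts[3]                      # acquisition date, YYYYMMDD
    return {
        "sat": scene[3],                 # 'LE07' -> '7'
        "year": int(date[0:4]),
        "month": int(date[4:6]),
    }


def band_label(filename):
    """Map a band file name to the band label used in the feature names."""
    suffix = filename.split("_")[7:]     # e.g. ['B6', 'VCID', '2.TIF']
    band = suffix[0].split(".")[0][1:]   # 'B1.TIF' -> '1', 'BQA.TIF' -> 'QA'
    if band == "6" and len(suffix) == 3:
        return "6" if suffix[2][0] == "1" else "9"
    if band == "QA":
        return "10"
    return band


scene = "LE07_L1TP_146039_20101223_20161211_01_T1"
info = parse_scene(scene)
print(info)
print(band_label(scene + "_B6_VCID_2.TIF"))
print(band_label(scene + "_BQA.TIF"))
```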

----
- *Month, Year, Spacecraft ID* all come from the **file name** itself.
- Regarding the pixel location selection:
    - Either sample **fixed points** at a regular gap and average the **non-zero ones**,
    - or select the points **randomly** and average only the non-zero ones.
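
Both sampling options can be sketched in plain Python. `grid_points`, `random_points` and `mean_nonzero` are hypothetical helpers, `k` and `count` are free parameters, and the tiny `read_pixel` function stands in for reading a raster value. Treating zero as "no data" is an assumption taken from the note about averaging only the non-zero values.

```python
import random


def grid_points(cols, rows, k):
    """Regular grid with roughly k steps per axis (integer step size)."""
    step_x, step_y = max(1, cols // k), max(1, rows // k)
    return [(i, j) for i in range(0, cols, step_x)
                   for j in range(0, rows, step_y)]


def random_points(cols, rows, count, seed=0):
    """count pixel locations drawn uniformly at random (reproducible via seed)."""
    rng = random.Random(seed)
    return [(rng.randrange(cols), rng.randrange(rows)) for _ in range(count)]


def mean_nonzero(points, read_pixel):
    """Average the sampled values, ignoring zeros (treated here as no data)."""
    vals = [read_pixel(i, j) for i, j in points]
    vals = [v for v in vals if v != 0]
    return sum(vals) / len(vals) if vals else None


# Stand-in raster: zero in the left part, 50 everywhere else
def read_pixel(i, j):
    return 0 if i < 4000 else 50


grid = grid_points(8161, 7221, 10)
rand = random_points(8161, 7221, 100)
print(len(grid), mean_nonzero(grid, read_pixel))
print(len(rand), mean_nonzero(rand, read_pixel))
```

The grid version is deterministic and mirrors the `range(0, col, col // k)` loops used above, while the random version avoids aligning the samples with any regular pattern in the image.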
            