## Review of NDVI workflow

Below we will review the workflow to calculate a difference NDVI from two dates (e.g. pre and post fire event).  

In [1]:
# Import necessary packages
import os
from glob import glob

import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import box
import geopandas as gpd
import rioxarray as rxr
from rasterio.plot import plotting_extent
#from rasterio.mask import mask
import earthpy as et
import earthpy.spatial as es
import earthpy.plot as ep

# Download data
data2 = et.data.get_data("ndvi-automation")

# Set working directory
os.chdir(os.path.join(et.io.HOME,
                      "earth-analytics",
                      "data"))


## Review of os and glob

The section below reviews `glob` and `os`, and introduces some `os` functionality for parsing file names that you have not yet learned.

Using `glob` to create lists and `os` to parse file names are handy tasks when you are trying to automate workflows!

## Create Directories that Work Across Operating Systems - os.path.join

When you are working across different computers and platforms, it is useful to create paths that can be recognized by the Windows, Mac and Linux operating systems. The `join()` function from the `os.path` module creates a path in the format required by the operating system on which the code is running (i.e. whatever your computer is running).

This saves you the time of creating and fixing paths as you work on different machines. This approach becomes very useful when you need to move your workflow from, say, your laptop to a cloud or HPC environment. 

`os.path.join` takes as many strings as you provide. It reads each string as a directory name and then creates an output path.

`os.path.join("dir1", "dir2", "dir3")`

IMPORTANT: you can create bad paths this way! This function does not actually test to ensure the path exists!
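
To see what "the format the operating system requires" means in practice, here is a minimal sketch that calls the two platform-specific versions of `join()` directly. `os.path` is simply whichever of these two standard library modules matches the machine the code runs on. The directory names are only illustrative.

```python
import ntpath      # the os.path implementation used on Windows
import posixpath   # the os.path implementation used on macOS and Linux

parts = ("ndvi-automation", "sites", "HARV")

# The same directory names joined with each platform's separator
posix_style = posixpath.join(*parts)
windows_style = ntpath.join(*parts)

print(posix_style)    # ndvi-automation/sites/HARV
print(windows_style)  # ndvi-automation\sites\HARV
```

In your own workflow you would keep calling `os.path.join`, which picks the right version for you.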

In [2]:
# Create a path
path = os.path.join("ndvi-automation", "sites")
path

'ndvi-automation/sites'

In [3]:
# Does the path exist?
os.path.exists(path)

True

In [4]:
# This path doesn't exist
path2 = os.path.join("Data", "NDVI-automation", "Sites")
os.path.exists(path2)

False

## Get Lists of Files Using glob and path.join

In a workflow where you are processing many files and directories, you can use `glob` with `path.join` to create a path and get a list of files in that path. 

`glob()` returns a list of every path that matches the pattern you provide. A pattern without a wildcard matches only that exact path, if it exists. 

In [5]:
# Without a wildcard, glob returns only the sites directory itself
path = os.path.join("ndvi-automation", 
                    "sites")
glob(path)

['ndvi-automation/sites']

You can add the syntax `*/` to tell glob to provide a list of directories rather than files. 
This will be useful when you try to get a list of subdirectories within a parent
directory - in this case there is one subdirectory for each NEON field site.

In [6]:
# Add a trailing slash to force listing of directories in a path
another_path = os.path.join("ndvi-automation", 
                            "sites")

# Get each subdirectory path using glob
all_sites = glob(os.path.join(another_path, "*/"))
all_sites

['ndvi-automation/sites/SJER/', 'ndvi-automation/sites/HARV/']

You can nest the above steps into one step as well.

In [7]:
# This single line of code does the same thing as the cell
# above, which is divided into several lines
glob(os.path.join("ndvi-automation", "sites", "*/"))

['ndvi-automation/sites/SJER/', 'ndvi-automation/sites/HARV/']
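
The difference between `*` and `*/` is easier to see in a directory that holds both files and folders. The sketch below builds a throwaway directory tree with `tempfile` (the folder and file names are hypothetical, not part of the course data) and compares the two patterns.

```python
import os
import tempfile
from glob import glob

with tempfile.TemporaryDirectory() as tmp:
    # Two site directories plus one loose file (hypothetical names)
    os.makedirs(os.path.join(tmp, "sites", "SJER"))
    os.makedirs(os.path.join(tmp, "sites", "HARV"))
    open(os.path.join(tmp, "sites", "readme.txt"), "w").close()

    # "*" matches files and directories; "*/" matches directories only
    everything = sorted(glob(os.path.join(tmp, "sites", "*")))
    dirs_only = sorted(glob(os.path.join(tmp, "sites", "*/")))

    # Keep just the last component of each path for display
    everything_names = [os.path.basename(os.path.normpath(p)) for p in everything]
    dir_names = [os.path.basename(os.path.normpath(p)) for p in dirs_only]

print(everything_names)  # ['HARV', 'SJER', 'readme.txt']
print(dir_names)         # ['HARV', 'SJER']
```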

Once you have a list of directories, you can loop through each directory 
and do something with the data within that directory.

In [8]:
# Print out all site directories
for site_files in all_sites:
    print(site_files)

ndvi-automation/sites/SJER/
ndvi-automation/sites/HARV/


There are several ways to produce a list of all directories within the `landsat-crop`
directory of each site subdirectory. One concise approach is to use `glob`, as shown below.

## Use Glob and the * Symbol

You can use the `*` syntax in `glob` to customize the list of folders returned. 
Remember that you can replace any part of a file path that you want to vary with a `*`. 

For example, you can get all of the folders within the `landsat-crop` folders by replacing the site folder with a `*`, as shown below. Notice how it finds everything within the `landsat-crop` folder in both the HARV and SJER folders.  

In [10]:
glob(os.path.join("ndvi-automation", "sites", "*", "landsat-crop", "*/"))

['ndvi-automation/sites/SJER/landsat-crop/LC080420342017090401T1-SC20181023162756/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017081901T1-SC20181023153141/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017102201T1-SC20181023153638/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017110701T1-SC20181023170129/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017070201T1-SC20181023153031/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017010701T2-SC20181023153321/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017092001T1-SC20181023170143/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017051501T1-SC20181023151959/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017042901T1-SC20181023153144/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017112301T1-SC20181023170128/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017061601T1-SC20181023152417/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017031201T1-

### How the * Operator Works

The approach above works well, but there is another way to get this list using `glob`! 

By adding a trailing `/` so that only directories are listed, you can make `glob` return the same list of directories without specifying the `landsat-crop` folder. 

This only works because none of the other directories within the `HARV` and `SJER` directories contain subdirectories; they store only individual files. 

In [30]:
glob(os.path.join("ndvi-automation", "sites", "*", "*", "*/"))

['ndvi-automation/sites/SJER/landsat-crop/LC080420342017090401T1-SC20181023162756/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017081901T1-SC20181023153141/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017102201T1-SC20181023153638/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017110701T1-SC20181023170129/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017070201T1-SC20181023153031/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017010701T2-SC20181023153321/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017092001T1-SC20181023170143/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017051501T1-SC20181023151959/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017042901T1-SC20181023153144/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017112301T1-SC20181023170128/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017061601T1-SC20181023152417/',
 'ndvi-automation/sites/SJER/landsat-crop/LC080420342017031201T1-

### Loops

This is a demonstration of using loops to begin to parse through 
each site directory. Note that here you are using nested 
loops. Print statements can be useful as checks to see 
what your loop is doing.

In [16]:
# Define the directory name
landsat_dir = "landsat-crop"

# Loop through each site directory
for site_dir in all_sites:
    print("I am looping through", site_dir)

    # Get a list of subdirectories for that site
    all_dirs = glob(os.path.join(site_dir, landsat_dir, '*'))
    # Loop through each subdirectory where your data are stored
    for adir in all_dirs:
        print("now processing ", adir)

I am looping through ndvi-automation/sites/SJER/
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017090401T1-SC20181023162756
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017081901T1-SC20181023153141
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017102201T1-SC20181023153638
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017110701T1-SC20181023170129
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017070201T1-SC20181023153031
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017010701T2-SC20181023153321
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017092001T1-SC20181023170143
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017051501T1-SC20181023151959
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017042901T1-SC20181023153144
now processing  ndvi-automation/sites/SJER/landsat-crop/LC080420342017112301T1-SC201810

### Sorting `glob` Lists

Notice that these lists aren't sorted. If it's important for a list to be in a certain order (imagery bands, for example, should be in the correct order), then make sure to sort the list after `glob` returns it.

For example, if two items have identical path names, but one ends in `10` and the other ends in `1`, the file ending in `10` may be listed before the file ending in `1`. Always double check the order in 
which your data are being processed!

In [18]:
# Sort the list that glob returns
sorted(glob(os.path.join('ndvi-automation', 
                         'sites', 
                         'HARV',
                         'landsat-crop', 
                         'LC080130302017072301T1-SC20181023152048', 
                         '*band*')))

['ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band1.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band2.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band3.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band4.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band5.tif']
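
One caveat: `sorted()` compares strings character by character, so a two-digit band number such as `10` sorts before `2`. The scene above only has bands 1-5, so this does not matter here, but it can for other data. Below is a minimal sketch of sorting by the band number itself. The file names and the `band_number` helper are hypothetical and only illustrate the idea.

```python
import re

# Hypothetical band file names, in the arbitrary order glob might return them
band_files = ["sr_band10.tif", "sr_band2.tif", "sr_band1.tif", "sr_band11.tif"]


def band_number(file_name):
    """Return the integer that follows 'band' in a file name."""
    return int(re.search(r"band(\d+)", file_name).group(1))


# Default sort compares text, so band10 lands before band2
alphabetical = sorted(band_files)
# Sorting with a key compares the band numbers as integers
numeric = sorted(band_files, key=band_number)

print(alphabetical)  # ['sr_band1.tif', 'sr_band10.tif', 'sr_band11.tif', 'sr_band2.tif']
print(numeric)       # ['sr_band1.tif', 'sr_band2.tif', 'sr_band10.tif', 'sr_band11.tif']
```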

### Why Sort `glob` Lists?

The order in which `glob` returns files from a folder is not guaranteed. Depending on the operating system being used, or the way the files are stored, different people may get results from a `glob` list in different orders. This can lead to data errors when running projects across computers. The example below shows how sorting a `glob` list changes which file you access at a given index. Notice how the same index (0) returns two different files. 

In [22]:
# Indexes can change once a list is sorted
# While some operating systems return the data sorted already, others do not
unsorted_list = glob(os.path.join('ndvi-automation', 
                                  'sites', 
                                  'HARV',
                                  'landsat-crop', 
                                  'LC080130302017072301T1-SC20181023152048', 
                                  '*band*'))

sorted_list = sorted(glob(os.path.join('ndvi-automation', 
                                       'sites', 
                                       'HARV',
                                       'landsat-crop', 
                                       'LC080130302017072301T1-SC20181023152048', 
                                       '*band*')))
unsorted_list[0], sorted_list[0]

('ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band2.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band1.tif')

### Using Ranges To Select Sets of Bands

In addition to using `*` to specify which parts of a file name are important to you, 
you can use `[]` to specify a range of characters to search for. Each `[]` matches 
exactly one character, not a multi-character string. You can search for the digits 2-7 with `[2-7]`, but 
you cannot search for the numbers 2 through 14 with `[2-14]`, because `14` is two 
characters rather than one. 

This is not just limited to numbers. `[d-q]` would also filter results for 
characters between the letters `d` and `q`. 
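
The one-character rule can be checked without touching the file system by using `fnmatch`, the standard library module that `glob` relies on to match names. This is a small sketch with hypothetical file names, including two-digit bands that the scene used in this lesson does not have.

```python
import fnmatch

# Hypothetical band file names (not read from disk)
names = ["sr_band1.tif", "sr_band2.tif", "sr_band3.tif",
         "sr_band4.tif", "sr_band10.tif", "sr_band11.tif"]

# One bracket matches one character: bands 1, 2 and 3 only
bands_1_to_3 = fnmatch.filter(names, "*band[1-3].tif")

# Two-digit bands need one bracket (or literal character) per digit
bands_10_to_11 = fnmatch.filter(names, "*band1[0-1].tif")

# A trailing * after the bracket also lets two-digit bands through
loose_match = fnmatch.filter(names, "*band[1-3]*")

print(bands_1_to_3)    # ['sr_band1.tif', 'sr_band2.tif', 'sr_band3.tif']
print(bands_10_to_11)  # ['sr_band10.tif', 'sr_band11.tif']
print(loose_match)
```

Note that the last pattern also returns bands 10 and 11, because the `*` after the bracket is free to match the second digit.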

In [23]:
# Get bands 1-3. Notice the order in which the bands are returned.
glob(os.path.join('ndvi-automation', 
                  'sites', 
                  'HARV',
                  'landsat-crop', 
                  'LC080130302017072301T1-SC20181023152048', 
                  '*band[1-3]*'))

['ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band2.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band3.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band1.tif']

In [24]:
# Sort the result so that bands 1-3 are returned in order
sorted(glob(os.path.join('ndvi-automation', 
                         'sites', 
                         'HARV',
                         'landsat-crop', 
                         'LC080130302017072301T1-SC20181023152048', 
                         '*band[1-3]*')))

['ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band1.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band2.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band3.tif']

### `?` Operator

The `?` operator is similar to the `*` operator, but it matches exactly one character. 

If one character in the file name can be variable, but everything else must stay the same, then `?` is a good way to replace just that one character. 

`?` is not limited to one use per search, and can be used to replace more than one character in a query. 
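
Because each `?` stands for exactly one character, the number of `?` symbols controls how many characters are allowed in that position. A brief sketch with hypothetical file names, again using the standard library `fnmatch` module rather than files on disk:

```python
import fnmatch

# Hypothetical band file names (not read from disk)
names = ["sr_band1.tif", "sr_band5.tif", "sr_band10.tif"]

# One ? allows exactly one character between "band" and ".tif"
one_digit = fnmatch.filter(names, "*band?.tif")

# Two ? symbols require exactly two characters in that position
two_digits = fnmatch.filter(names, "*band??.tif")

print(one_digit)   # ['sr_band1.tif', 'sr_band5.tif']
print(two_digits)  # ['sr_band10.tif']
```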

In [28]:
# ? operator
glob(os.path.join('ndvi-automation', 
                  'sites', 
                  'HARV',
                  'landsat-crop', 
                  'LC080130302017072301T1-SC20181023152048', 
                  '*band?.tif'))

['ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band2.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band3.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band1.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band4.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band5.tif']

In [29]:
# Multiple ? operators
glob(os.path.join('ndvi-automation', 
                  'sites', 
                  'HARV',
                  'landsat-crop', 
                  'LC080130302017072301T1-SC20181023152048', 
                  '*band?????'))

['ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band2.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band3.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band1.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band4.tif',
 'ndvi-automation/sites/HARV/landsat-crop/LC080130302017072301T1-SC20181023152048/LC08_L1TP_013030_20170723_20180125_01_T1_sr_band5.tif']

## Grab Parts of a Directory Path

There are several ways that you can grab just a part of a path. Sometimes a file path has metadata in it that can help you create meaningful variable names in your script. In your NDVI workflow, you may want to grab the site name from the directory path. 

You can use a combination of `normpath()` and `basename()` functions from `os.path` to access the last directory in a path. In your case, this path contains your site name!
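
The reason for combining the two functions is the trailing slash that `glob` adds when you list directories with `*/`. A short sketch of the difference, using one of the site paths from this lesson written out as a plain string:

```python
import os

# A directory path with a trailing slash, as returned by glob with "*/"
site_dir = "ndvi-automation/sites/HARV/"

# basename returns everything after the final separator, which is empty here
without_normpath = os.path.basename(site_dir)

# normpath removes the trailing separator first, so basename finds the site
with_normpath = os.path.basename(os.path.normpath(site_dir))

print(repr(without_normpath))  # ''
print(repr(with_normpath))     # 'HARV'
```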


In [33]:
# Example of normpath cleaning up path
example_path = "/home/user/../example_dir"
os.path.normpath(example_path)

'/home/example_dir'

In [44]:
# Use normpath and basename together to get the last directory name
# This will be helpful for separating data from HARV vs SJER
sitename = os.path.basename(os.path.normpath(site_files))
sitename

'HARV'

There are endless ways to use the sitename as a variable in an automated workflow.

In [45]:
# Create a file name needed to open a file
print(os.path.join(site_files, "vector", sitename + "-crop.shp"))

# Create a generic output path to an output csv file
print(os.path.join("ndvi-automation", "outputs", sitename + "-ndvi.csv"))

ndvi-automation/sites/HARV/vector/HARV-crop.shp
ndvi-automation/outputs/HARV-ndvi.csv


If you want to grab both the last directory name and the path prior to that directory, you can use `os.path.split` with `normpath()`.

In [48]:
os.path.split(os.path.normpath(site_files))

('ndvi-automation/sites', 'HARV')

## Parse Text From Directory Names

There are numerous options to parse text from a file path. In your homework, you need to grab the date when each Landsat scene was collected. To grab just the date from the directory, you will need to:

1. get the full directory path
2. find the date embedded within the path name

If you refer back to the Landsat metadata, you will see that every scene has the same naming convention. 

This means that you can count the characters (i.e. indices) in the directory name to find the collection date (which is the first date in the string) and use the same indices for every scene!

In this case, you can find the date using a string index like this:

`astring[startindex:endindex]`

In [49]:
# View directory name
dir_name = os.path.basename(os.path.normpath(adir))

In [50]:
# Get landsat date from directory name
date = dir_name[10:18]
date

'20170909'
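
The sliced value is still a string. If you plan to plot or sort by date later, it may help to convert it to a date object. The sketch below assumes the directory name follows the convention described above and uses `datetime.strptime` from the standard library. The variable names are only illustrative.

```python
from datetime import datetime

# Directory name of one Landsat scene used in this lesson
dir_name = "LC080130302017090901T1-SC20181023151921"

# Characters 10 through 17 hold the acquisition date as YYYYMMDD
date_text = dir_name[10:18]

# Convert the text into a datetime object
acquired = datetime.strptime(date_text, "%Y%m%d")

print(date_text)        # 20170909
print(acquired.date())  # 2017-09-09
```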

You can also break the entire path apart, if you need to do so, using `string_name.split()`.

`.split()` is a built-in Python string method that splits a string into a list of strings based on a separator 
character. For file paths, `os.sep` is a system-friendly way to separate file paths into their base parts. 

In [51]:
# Break paths into components
path = os.path.normpath(adir)
path.split(os.sep)

['ndvi-automation',
 'sites',
 'HARV',
 'landsat-crop',
 'LC080130302017090901T1-SC20181023151921']

As you see, `string_name.split()` produces a list that you can query to get a specific component.

In [52]:
# Get the site name from the path
path_components = path.split(os.sep)
path_components[2]

'HARV'
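
As an alternative to splitting on `os.sep`, the standard library `pathlib` module exposes the same components as attributes. This is an optional sketch, not part of the original workflow. It uses `PurePosixPath` so that the example behaves the same on every operating system; in your own code you would more likely use `pathlib.Path`.

```python
from pathlib import PurePosixPath

scene_path = PurePosixPath(
    "ndvi-automation/sites/HARV/landsat-crop/"
    "LC080130302017090901T1-SC20181023151921"
)

# .parts is a tuple of path components, similar to path.split(os.sep)
site_from_parts = scene_path.parts[2]

# .name is the last component; .parent steps up one directory
scene_name = scene_path.name
site_from_parent = scene_path.parent.parent.name

print(site_from_parts)   # HARV
print(scene_name)        # LC080130302017090901T1-SC20181023151921
print(site_from_parent)  # HARV
```

Stepping up with `.parent` avoids hard-coding the index `2`, which would change if the data were stored at a different depth.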

## Another approach: regular expressions

A reminder:
### Landsat File Naming Convention

Landsat and many other satellite remote sensing data are named in a way that tells you about:

* When the data were collected and processed
* What sensor was used to collect the data
* What satellite was used to collect the data.

And more. 

Here you will learn a few key components of the Landsat 8 collection file name. The first scene that you work with below is named:

`LC080340322016072301T1-SC20180214145802`

First, we have LC08

* **L:** Landsat Sensor
* **C:** OLI / TIRS combined platform
* **08:** Landsat 8 (not 7)

* **034032:** The next 6 digits represent the path and row of the scene. This identifies the spatial coverage of the scene

Finally, you have a date. In your case as follows:

* **20160723:** representing the year, month and day that the data were collected.

The second part of the file name above tells you more about when the data were last processed. You can read more about this naming convention using the link below.

<a href="https://landsat.usgs.gov/what-are-naming-conventions-landsat-scene-identifiers" target="_blank">Learn more about Landsat 8 file naming conventions.</a>

As you work with these data, it is good to double check that you are working with the sensor (Landsat 8) and the time period that you intend. Having this information in the file name makes it easier to keep track of this as you process your data. 
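
The naming convention above can be written as a regular expression with one named group per component. The sketch below is one possible pattern, assuming the six digits after `LC08` are a 3-digit path followed by a 3-digit row. The `parse_scene` helper is hypothetical and returns `None` when a name does not match, which makes it easy to skip unexpected directories.

```python
import re

# Assumed layout: LC08 + 3-digit path + 3-digit row + 8-digit acquisition date
scene_re = re.compile(r"LC08(?P<path>\d{3})(?P<row>\d{3})(?P<date>\d{8})")


def parse_scene(dir_name):
    """Return a dict with path, row and date, or None if there is no match."""
    match = scene_re.match(dir_name)
    return match.groupdict() if match else None


info = parse_scene("LC080340322016072301T1-SC20180214145802")
no_match = parse_scene("not-a-landsat-scene")

print(info)      # {'path': '034', 'row': '032', 'date': '20160723'}
print(no_match)  # None
```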

In [74]:
# Import at the top!
import re

path_components[-1]

l8_re = re.compile(r'LC08\d{6}(?P<date>\d{8})')

# Define the directory name
landsat_dir = "landsat-crop"

# Loop through each site directory
for site_dir in all_sites:
    print("I am looping through", site_dir)

    # Get a list of subdirectories for that site
    all_dirs = glob(os.path.join(site_dir, landsat_dir, '*'))
    # Loop through each subdirectory where your data are stored
    for adir in all_dirs:
        # Match the pattern against the directory name only
        l8_info = l8_re.match(os.path.basename(adir))
        print('processing date: {}'.format(l8_info.group('date')))

I am looping through ndvi-automation/sites/SJER/
processing date: 20170904
processing date: 20170819
processing date: 20171022
processing date: 20171107
processing date: 20170702
processing date: 20170107
processing date: 20170920
processing date: 20170515
processing date: 20170429
processing date: 20171123
processing date: 20170616
processing date: 20170312
processing date: 20170208
processing date: 20170803
processing date: 20170123
processing date: 20171209
processing date: 20171225
processing date: 20170531
processing date: 20171006
processing date: 20170224
processing date: 20170413
processing date: 20170328
processing date: 20170718
I am looping through ndvi-automation/sites/HARV/
processing date: 20170418
processing date: 20170317
processing date: 20170707
processing date: 20170213
processing date: 20170824
processing date: 20170808
processing date: 20170301
processing date: 20171128
processing date: 20171027
processing date: 20170504
processing date: 20170925
processing date: 2

## Loops to Create Lists and DataFrames

For this workflow, you will want to capture NDVI data and 
ultimately produce a DataFrame that can be used to plot those 
data by site and date.

In [26]:
all_sites

['ndvi-automation/sites/SJER/', 'ndvi-automation/sites/HARV/']

In [87]:
# Define the directory name
landsat_dir = "landsat-crop"

# Create an empty list
ndvi_list =  []
# Loop through each site directory
for site_dir in all_sites:
    print("I am looping through", site_dir)
    asite = os.path.normpath(site_dir).split(os.sep)[-1]
    print("I am working on the", asite, "field site now")

    # Get a list of subdirectories for that site
    data_dirs = sorted(glob(os.path.join(site_dir, landsat_dir, '*')))

    # Loop through each subdirectory where your data are stored
    for adir in data_dirs:
        print("Now processing", adir)
        # Calculate NDVI (placeholder value for now)
        ndvi = 'THE NDVI'
        # Capture the site name, date (placeholder) and NDVI in a list
        output = [asite, 'DATE', ndvi]
        ndvi_list.append(output)

I am looping through ndvi-automation/sites/SJER/
I am working on the SJER field site now
Now processing ndvi-automation/sites/SJER/landsat-crop/LC080420342017010701T2-SC20181023153321
Now processing ndvi-automation/sites/SJER/landsat-crop/LC080420342017012301T1-SC20181023170015
Now processing ndvi-automation/sites/SJER/landsat-crop/LC080420342017020801T1-SC20181023162521
Now processing ndvi-automation/sites/SJER/landsat-crop/LC080420342017022401T1-SC20181023152103
Now processing ndvi-automation/sites/SJER/landsat-crop/LC080420342017031201T1-SC20181023152108
Now processing ndvi-automation/sites/SJER/landsat-crop/LC080420342017032801T1-SC20181023162825
Now processing ndvi-automation/sites/SJER/landsat-crop/LC080420342017041301T1-SC20181023170020
Now processing ndvi-automation/sites/SJER/landsat-crop/LC080420342017042901T1-SC20181023153144
Now processing ndvi-automation/sites/SJER/landsat-crop/LC080420342017051501T1-SC20181023151959
Now processing ndvi-automation/sites/SJER/landsat-crop/L

In [88]:
# Example  Output  List
ndvi_list

[['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['SJER', 'DATE', 'THE NDVI'],
 ['HARV', 'DATE', 'THE NDVI'],
 ['HARV', 'DATE', 'THE NDVI'],
 ['HARV', 'DATE', 'THE NDVI'],
 ['HARV', 'DATE', 'THE NDVI'],
 ['HARV', 'DATE', 'THE NDVI'],
 ['HARV', 'DATE', 'THE NDVI'],
 ['HARV', 'DATE', 'THE NDVI'],
 ['HARV', 'DATE', 'THE NDVI'],
 ['HARV', 'DATE', 'THE NDVI'],
 ['HARV'

In [89]:
# Import should be at the top!
import pandas as pd

# Create final dataframe
pd.DataFrame(ndvi_list)

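|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | SJER | DATE | THE NDVI |
| 1 | SJER | DATE | THE NDVI |
| 2 | SJER | DATE | THE NDVI |
| 3 | SJER | DATE | THE NDVI |
| 4 | SJER | DATE | THE NDVI |
| 5 | SJER | DATE | THE NDVI |
| 6 | SJER | DATE | THE NDVI |
| 7 | SJER | DATE | THE NDVI |
| 8 | SJER | DATE | THE NDVI |
| 9 | SJER | DATE | THE NDVI |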


In [90]:
#  Create final dataframe and rename columns
pd.DataFrame(ndvi_list,
             columns=["site","date","ndvi"])

|   | site | date | ndvi |
|---|---|---|---|
| 0 | SJER | DATE | THE NDVI |
| 1 | SJER | DATE | THE NDVI |
| 2 | SJER | DATE | THE NDVI |
| 3 | SJER | DATE | THE NDVI |
| 4 | SJER | DATE | THE NDVI |
| 5 | SJER | DATE | THE NDVI |
| 6 | SJER | DATE | THE NDVI |
| 7 | SJER | DATE | THE NDVI |
| 8 | SJER | DATE | THE NDVI |
| 9 | SJER | DATE | THE NDVI |
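
Once the placeholders are replaced with real dates and NDVI values, a common next step is to convert the date strings to datetimes and use them as the index, which makes plotting by date easier. This is a rough sketch under stated assumptions: the dates come from directory names in this lesson, but the NDVI numbers are made up purely for illustration.

```python
import pandas as pd

# Placeholder rows: the NDVI values are made up for illustration only
ndvi_rows = [
    ["SJER", "20170107", 0.25],
    ["SJER", "20170123", 0.30],
    ["HARV", "20170418", 0.45],
]

ndvi_df = pd.DataFrame(ndvi_rows, columns=["site", "date", "ndvi"])

# Convert YYYYMMDD strings to datetime values and use them as the index
ndvi_df["date"] = pd.to_datetime(ndvi_df["date"], format="%Y%m%d")
ndvi_df = ndvi_df.set_index("date")

print(ndvi_df)
```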
