# Week 4: lists and subloops

Sequences have order, meaning that individual items have specific positions.  You can use those positions:

* as their explicit meaning (so the numbers are directly meaningful)
* as a transformed meaning (so you can add 1 or do something else to the position number to make it meaningful)
* with a referenced meaning (such that you can use the position number to look up the meaning).

These items can also be individually manipulated.  Once referenced, they can be stored in memory, placed in another structure, and used as a source for further data.  
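As a small illustration of those three uses of position, here is a sketch with a made-up pair of lists.  The names `months` and `days_in_month` are hypothetical and are not part of this week's data.

```python
# Hypothetical example data: two lists that line up by position.
months = ["January", "February", "March"]
days_in_month = [31, 28, 31]

for position in range(len(months)):
    # explicit meaning: the position number itself
    # transformed meaning: add 1 to get the calendar month number
    # referenced meaning: use the position to look things up in the lists
    print(position, position + 1, months[position], days_in_month[position])

# Once referenced, an item can be stored and placed in another structure.
first_month = months[0]
picked = []
picked.append(first_month)
print(picked)
```

The same position number does three jobs here, which is the idea we'll lean on for the rest of the week.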

# Problem statement

Given a data file of data repository records, extract the number of downloads for each record.  Then calculate the total number of dataset downloads.

In [1]:
# we'll be discussing reading in files later!

f = open('report.txt', 'r')

full_text = f.read()

f.close()

Here's a small snippet we can see in one screen:

In [2]:
sample = """Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V1
Downloads: 9 (2017-08-30 to 2017-09-13 )
-----

Park, Jungsik; Le, Brian; Sklenar, Joseph; Chern, Gia-wei; Watts, Justin; Schiffer, Peter (2017): Magnetic response of brickwork artificial spin ice. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1528275_V1
Funder: U.S. Department of Energy (DOE), Grant: DE-SC0010778
Downloads: 10 (2017-09-08 to 2017-09-13 )
-----

Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2
Downloads: 6 (2017-09-06 to 2017-09-13 )
-----

Christensen, Sarah; Molloy, Erin K.; Vachaspati, Pranjal; Warnow, Tandy (2017): Datasets from the study: Optimal completion of incomplete gene trees in polynomial time using OCTAL. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-8402610_V1
Funder: U.S. National Science Foundation (NSF), Grant: CCF-1535977
Funder: U.S. National Science Foundation (NSF), Grant: DGE-1144245
Downloads: 47 (2017-06-15 to 2017-09-13 )
-----"""

# What's our structure here?

We've got data on datasets here, but this isn't a tabular data file.  We can't open it up in Excel or directly do computations on it, yet.  We need to transform it into something with a standard structure for analysis.

A few things to note:

* the number of lines per entry is variable, so we can't exploit a steady mathematical structure like we did with the Raven last week.
* even if variable, we can see that the lines inside each entry are meaningful 
* each line has a field label followed by a : and then content
* some of these fields have multiple entries (so multiple lines with the same field entry)
* some of the line content has multiple values
* the string "-----" appears (to appear!) between each entry
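To make the "field label, then a `:`, then content" observation concrete, here is a rough sketch on one line copied from the sample.  It uses the optional second argument to `.split()` (the maximum number of splits), which we haven't covered yet, so treat it as a preview rather than the method we'll use below.

```python
# One line copied from the sample above.
line = "Funder: U.S. Department of Energy (DOE), Grant: DE-SC0010778"

# Split on the first ": " only, so the later "Grant: " stays in the content.
parts = line.split(": ", 1)

label = parts[0]
content = parts[1]

print(label)
print(content)
```

Notice that the content itself still holds multiple values, which matches the note above about lines with more than one data point.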

# What's our data granularity?

Last week we explored a poem and found that our unit of analysis was the individual line.  That made it easy because python knows about lines and has functions designed to easily interact with them.  In this case, our individual data entities are the records about each dataset.  This sort of thing, especially with the variability in size, is not something that python is able to directly interact with.  Our recourse for this issue is to use the tools inside of python to encode these chunks as individual data entries.  Once we have the individual records out we can operate on them independently and extract the information.

This pattern of breaking the data apart so that we can apply broader (easier) methods of splitting individual data points out will be a common one.

Let's take a moment to consider the Raven again.  Say that we want to know about the words in each stanza.  We could use regular `.split()` on the entire poem and get all the words.  We'd have the granularity that we want (the words) but the membership information would then be gone.

Recall our basic for loop over the lines of the poem:  it allows us to isolate one line at a time so that we can manipulate or take measurements from that line, and infer that those measurements belong to that line because it is the one we are looping over at the moment.

Recall our comparison of using `range(0, len(raven_lines), 7)` to make the position numbers and look up each line versus just doing a list slice (`::7`).  Both gave us the same content back (the first line of each stanza), but the list slicing method lost the line number information in the process.  That origin information could not be derived back from the individual line itself, so we depended on the loop variable within the `range` loop to carry that piece of metadata about the line.
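Here is a minimal reminder of that difference.  The list below is a stand-in with invented placeholder strings, not the real poem text, and the name `lines` is just for this sketch.

```python
# Stand-in for the poem: 14 placeholder "lines" (two blocks of 7).
# These strings are invented for illustration only.
lines = []
for n in range(14):
    lines.append("placeholder line " + str(n))

# Slicing gives us the content only.
print(lines[::7])

# Looping over range() gives us the content and where it came from.
for i in range(0, len(lines), 7):
    print(i, lines[i])
```

Both approaches pull out the same two strings, but only the `range` loop still knows the position numbers.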

In that pattern, the numbers that we generated with `range` weren't directly derived from our original content.  We used a known pattern to (correctly!) generate position numbers, so there was a certain amount of trust that we had to put in place to make things work.

This time around we are splitting our data into chunks so that we can individually act on the data inside of each one.  This means that instead of getting out all the lines that have the funder information and doing something fancy to figure out which chunk each was from, we can take out all the chunks and then get the funder lines from there.  Because we are isolating the data records from each other, we can use a pretty unfancy method of getting out the lines that we want.  

For example, the last line of each record is the download count for that dataset.  That location rule would be impossible to use if we weren't isolating each chunk.
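A quick sketch of why that location rule depends on isolating the chunks.  The text in `tiny` is made up for illustration (two pretend records), and the exact delimiter handling is simplified compared to what we'll work out for the real file.

```python
# A tiny stand-in with two made-up records (not real data).
tiny = "Title A\nDownloads: 1\n-----\nTitle B\nFunder: X\nDownloads: 2"

# Without isolating records, "the last line" only exists once.
all_lines = tiny.split("\n")
print(all_lines[-1])

# With records isolated, every record has its own last line.
last_lines = []
for record in tiny.split("-----"):
    record_lines = record.strip().split("\n")
    last_lines.append(record_lines[-1])
print(last_lines)
```

The first print only ever finds the final download line in the file, while the loop finds one per record.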

So to answer our question:  we have several granularities.

1. Each data record
2. Each line
3. Each data point in each line

# What's the magic word that we see in that list?

**`each`**

We aren't going to tackle a triple nested loop to start with, but we need to start somewhere.  Let's give ourselves a little to do list:

1. Get all the data records
2. For each record, get the lines
3. For each line of interest, get the data of interest

Don't worry, we're going to do this one at a time.  We won't actually need to do a triple nested loop because we can store our intermediate results.
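To show the general shape of "store the intermediate results instead of nesting", here is a sketch on made-up miniature data.  The names `text`, `records`, `count_lines`, and `counts` are all hypothetical, and the delimiter is simplified.

```python
# Made-up miniature data, just to show the shape of the approach.
text = "id 1\ncount: 3\n-----\nid 2\ncount: 4"

# Pass 1: get all the records.
records = text.split("\n-----\n")

# Pass 2: for each record, get the line of interest and store it.
count_lines = []
for record in records:
    count_lines.append(record.split("\n")[-1])

# Pass 3: for each stored line, get the data of interest.
counts = []
for line in count_lines:
    counts.append(line.split()[-1])

print(records)
print(count_lines)
print(counts)
```

Each pass is a single, flat loop over the list that the previous pass left behind.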

# Step 1: get the records

Not shockingly, we're going to start with `.split()`.  Remember that we need two things for this:

1.  A string to split apart
    * Got this covered:  `full_text` (and `sample` for our small test)
2.  Something in that string to split it apart on
    * We have a good theory here: `-----` but we need to confirm
    
We can visually inspect our file and see that this appears to be between each record.  Not only that, we can look all the way to the end and see that it isn't just between the records: it appears at the end of each record.  So there isn't one before the first record and there is one after the last, meaning that we can expect to see 56 instances (one for every expected record).

We can do a quick string method to count how many times it appears in the file.  We'll try it first on our small sample to see it working, and then we'll deploy it onto the whole document that we have stored in memory.

We can visually inspect our sample to see that there are 4 entries, so we can expect (hope?) to see a result of 4 when we run our candidate delimiter through the string function that counts how many appearances it makes.

In [3]:
print(sample.count("-----"))

4


Great, we can see that we found the expected number of instances in our sample.  Let's try it on our full version.  Remember that we are hoping to see 56.

In [4]:
print(full_text.count('-----'))

56


Yay!  Now we can see what the results are of running this through `.split()`.

In [5]:
sample_split = sample.split('-----')

print("There are", len(sample_split), "records")

print(sample_split)

There are 5 records
['Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V1\nDownloads: 9 (2017-08-30 to 2017-09-13 )\n', '\n\nPark, Jungsik; Le, Brian; Sklenar, Joseph; Chern, Gia-wei; Watts, Justin; Schiffer, Peter (2017): Magnetic response of brickwork artificial spin ice. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1528275_V1\nFunder: U.S. Department of Energy (DOE), Grant: DE-SC0010778\nDownloads: 10 (2017-09-08 to 2017-09-13 )\n', '\n\nKozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2\nDownloads: 6 (2017-09-06 to 2017-09-13 )\n', '\n\nChristensen, Sarah; Molloy, Erin K.; Vachaspati, Pranjal; Warnow, Tandy (2017): Datasets from the study: Optimal completion of incomple

That length of 5 is a little concerning, but we can take a look and see that the last element is an empty string.  This happens when the string we are splitting on appears at the very end, as you can see in the example below.

In [6]:
print('a-b-c-d-'.split("-"))

['a', 'b', 'c', 'd', '']


So our split has worked, but as usual there's a little more fussing we can do to make it better.  This is a good example of a situation where you won't know exactly what you'll need to do until you get the contents loaded and start working with it.

Look at the sections in the list where the strings should be separated.  Here's a snippet of what I want you to see:

`2017-09-13 )`**`\n', '\n\n`**`Christensen,`

We haven't mentioned it before, but we've been passing a string with multiple characters in it to `.split()` and it has been using it as a single item to split on.  This means that we can add more to the string that we are passing it and see what changes.  From this example, we can see that there are newline characters (`\n`) surrounding the delimiter, 1 before and 2 after.  If we look closer at the actual file we can see these characters in action.  The `-----` appears on its own line (so that's the 1 newline before), and there is an extra empty line just below it (so that's the 2 after).  We can try including that in our split.  This change may have two effects:  it'll clean up the results a bit (we could always use `.strip()` on them, so that wasn't really a concern), and it may also get rid of that trailing empty string in our results.
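If the "single item" behavior feels abstract, here is a tiny sketch on a made-up string (the name `text` is just for this example) showing how the length of the delimiter changes where the splits happen.

```python
text = "a--b-c--d"

# A single dash as the delimiter: every dash is a split point.
single = text.split("-")
print(single)

# Two dashes as the delimiter: only the exact sequence "--" is a split point.
double = text.split("--")
print(double)
```

With the longer delimiter, the lone dash inside `b-c` is left alone because it doesn't match the whole sequence.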

So I'm going to copy over the code from our earlier split of `sample` and just add those characters in.

In [12]:
sample_split = sample.split('\n-----\n\n')

print("There are", len(sample_split), "records")

print(sample_split)

There are 4 records
['Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V1\nDownloads: 9 (2017-08-30 to 2017-09-13 )', 'Park, Jungsik; Le, Brian; Sklenar, Joseph; Chern, Gia-wei; Watts, Justin; Schiffer, Peter (2017): Magnetic response of brickwork artificial spin ice. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1528275_V1\nFunder: U.S. Department of Energy (DOE), Grant: DE-SC0010778\nDownloads: 10 (2017-09-08 to 2017-09-13 )', 'Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2\nDownloads: 6 (2017-09-06 to 2017-09-13 )', 'Christensen, Sarah; Molloy, Erin K.; Vachaspati, Pranjal; Warnow, Tandy (2017): Datasets from the study: Optimal completion of incomplete gene trees in p

There's a lot of information happening in this readout, so we'll have to look closely to see what's going on.  Looking between the elements, we can see that the breaks are pretty clean, but the last element is still holding a copy of our delimiter.  This is because it ends with `\n-----` but is missing the final `\n\n`, so it doesn't match the full delimiter and is not removed.

We can play with removing some of the new lines in the split and see if that helps.

In [13]:
sample_split = sample.split('\n-----')

print("There are", len(sample_split), "records")

print(sample_split)

There are 5 records
['Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V1\nDownloads: 9 (2017-08-30 to 2017-09-13 )', '\n\nPark, Jungsik; Le, Brian; Sklenar, Joseph; Chern, Gia-wei; Watts, Justin; Schiffer, Peter (2017): Magnetic response of brickwork artificial spin ice. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1528275_V1\nFunder: U.S. Department of Energy (DOE), Grant: DE-SC0010778\nDownloads: 10 (2017-09-08 to 2017-09-13 )', '\n\nKozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2\nDownloads: 6 (2017-09-06 to 2017-09-13 )', '\n\nChristensen, Sarah; Molloy, Erin K.; Vachaspati, Pranjal; Warnow, Tandy (2017): Datasets from the study: Optimal completion of incomplete gen

Removing the trailing two `\n` characters allows the delimiter to go away, but now we have that empty string appearing again.  

We have to choose:  do we want to deal with removing the `-----` at the end, or do we want to strip the whitespace off and remove the last empty string?  We could also alter our original text to fix this last delimiter to look like the others.

In [21]:
sample_fixed = sample + "\n\n"

sample_split = sample_fixed.split('\n-----\n\n')

print("There are", len(sample_split), "records")

print(sample_split)

There are 5 records
['Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V1\nDownloads: 9 (2017-08-30 to 2017-09-13 )', 'Park, Jungsik; Le, Brian; Sklenar, Joseph; Chern, Gia-wei; Watts, Justin; Schiffer, Peter (2017): Magnetic response of brickwork artificial spin ice. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1528275_V1\nFunder: U.S. Department of Energy (DOE), Grant: DE-SC0010778\nDownloads: 10 (2017-09-08 to 2017-09-13 )', 'Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2\nDownloads: 6 (2017-09-06 to 2017-09-13 )', 'Christensen, Sarah; Molloy, Erin K.; Vachaspati, Pranjal; Warnow, Tandy (2017): Datasets from the study: Optimal completion of incomplete gene trees in p

Adding the extra newlines fixes the split, but we've got that extra empty string in there.  We could always remove the last element, so long as we are sure that the last item is empty.  That kind of check will be something we can do when we start with boolean logic and decision structures.  For now, we can explore an alternative path.

We could also try to remove the trailing `-----` from the text itself, but that content is a subset of our larger delimiter, so removing every copy of it would remove all our delimiters.

We could also slice the trailing characters off the string (the last 5 for `-----`, or the last 6 to take the newline in front of it too), which gets us to the same place.
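As a sketch of that slicing idea, here is a made-up miniature text with the same delimiter layout as our sample.  The names `text`, `five`, and `six` are hypothetical.  One thing to watch: the dashes are 5 characters, but the newline in front of them makes 6.

```python
# Made-up miniature text with the same delimiter layout as the sample.
text = "a\nb\n-----\n\nc\nd\n-----"

# Dropping the last 5 characters removes "-----" but leaves a newline behind.
five = text[:-5].split("\n-----\n\n")
print(five)

# Dropping 6 characters also removes the newline in front of the dashes.
six = text[:-6].split("\n-----\n\n")
print(six)
```

That leftover newline matters later, because splitting the last record on `\n` would then produce an empty final line.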

For the sake of practicing our list methods, we're going to explore `.pop()`.  With more advanced tests we could put in a check that it really is a string of length 0, but for now we can at least visually inspect what we are removing.

In [22]:
sample_split.pop(-1) # we want the last one, so our -1 friend will come back

''

`.pop()` mutates our original list in place, which is why there is no assignment statement here.  In fact, if we reassign our list to the result of `.pop()`, we overwrite all the data we want with the one item that we are removing.

We can see this in action.

In [25]:
sample_fixed = sample + "\n\n"

sample_split = sample_fixed.split('\n-----\n\n')

print("There are", len(sample_split), "records")

print(sample_split)

sample_split = sample_split.pop(-1)

print("and now our data is:", sample_split)

There are 5 records
['Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V1\nDownloads: 9 (2017-08-30 to 2017-09-13 )', 'Park, Jungsik; Le, Brian; Sklenar, Joseph; Chern, Gia-wei; Watts, Justin; Schiffer, Peter (2017): Magnetic response of brickwork artificial spin ice. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1528275_V1\nFunder: U.S. Department of Energy (DOE), Grant: DE-SC0010778\nDownloads: 10 (2017-09-08 to 2017-09-13 )', 'Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2\nDownloads: 6 (2017-09-06 to 2017-09-13 )', 'Christensen, Sarah; Molloy, Erin K.; Vachaspati, Pranjal; Warnow, Tandy (2017): Datasets from the study: Optimal completion of incomplete gene trees in p

In [32]:
sample_fixed = sample + "\n\n"

sample_split = sample_fixed.split('\n-----\n\n')

print("There are", len(sample_split), "records")

sample_split.pop(-1)

print(sample_split)

print("There are now", len(sample_split), "records")

There are 5 records
['Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V1\nDownloads: 9 (2017-08-30 to 2017-09-13 )', 'Park, Jungsik; Le, Brian; Sklenar, Joseph; Chern, Gia-wei; Watts, Justin; Schiffer, Peter (2017): Magnetic response of brickwork artificial spin ice. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1528275_V1\nFunder: U.S. Department of Energy (DOE), Grant: DE-SC0010778\nDownloads: 10 (2017-09-08 to 2017-09-13 )', 'Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2\nDownloads: 6 (2017-09-06 to 2017-09-13 )', 'Christensen, Sarah; Molloy, Erin K.; Vachaspati, Pranjal; Warnow, Tandy (2017): Datasets from the study: Optimal completion of incomplete gene trees in p

# Step 2: split the records apart

So let's back up and consider what we have done and still need to do.

We've got a list of our individual records.  Next step:  go over each record and get the line with the downloads data.

As we have explored, we know that the downloads line is the last line of each record (now do you see why I gave this problem statement?).

Let's grab a single record to play with and get a proof of concept going.  Once we're happy with how we are splitting the record up, then we can integrate that into a for loop.  We know that we want the lines out, and we've already explored how to do that with `str.split('\n')`.

In [34]:
print(sample_split[2])

Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2
Downloads: 6 (2017-09-06 to 2017-09-13 )


In [35]:
print(sample_split[2].split('\n'))

['Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2', 'Downloads: 6 (2017-09-06 to 2017-09-13 )']


This is a smaller record, but we can see that we have the citation as one element, then the downloads element is indeed the last one.

This seems good enough to deploy on our entire sample.  Remember that we should start small and just print out the basics first.

In [36]:
for record in sample_split:
    print(record.split('\n'))

['Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V1', 'Downloads: 9 (2017-08-30 to 2017-09-13 )']
['Park, Jungsik; Le, Brian; Sklenar, Joseph; Chern, Gia-wei; Watts, Justin; Schiffer, Peter (2017): Magnetic response of brickwork artificial spin ice. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-1528275_V1', 'Funder: U.S. Department of Energy (DOE), Grant: DE-SC0010778', 'Downloads: 10 (2017-09-08 to 2017-09-13 )']
['Kozuch, Laura; Walker, Karen; Marquardt, William (2017): Modern sinistral whelk spire angles, genus Busycon . University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-2031816_V2', 'Downloads: 6 (2017-09-06 to 2017-09-13 )']
['Christensen, Sarah; Molloy, Erin K.; Vachaspati, Pranjal; Warnow, Tandy (2017): Datasets from the study: Optimal completion of incomplete gene trees in polynomial

So far so good, but how can we tell if the downloads line is indeed the last line of each?  We could just print out the last line of each record and visually inspect that each line starts with `Downloads`.

In [37]:
for record in sample_split:
    record_split = record.split('\n')
    print(record_split[-1])

Downloads: 9 (2017-08-30 to 2017-09-13 )
Downloads: 10 (2017-09-08 to 2017-09-13 )
Downloads: 6 (2017-09-06 to 2017-09-13 )
Downloads: 47 (2017-06-15 to 2017-09-13 )


We have 4 records and now we have 4 download lines.  So this step seems to be taken care of, and this is a good time to test it on all of our data.  Let's also go ahead and combine all these things together.

In [38]:
full_text_fixed = full_text + "\n\n"

full_text_fixed_split = full_text_fixed.split('\n-----\n\n')

print("There are", len(full_text_fixed_split), "records")

full_text_fixed_split.pop(-1)

print("There are now", len(full_text_fixed_split), "records")

There are 57 records
There are now 56 records


Good!  We have 56 records.  Let's start looping.

In [39]:
for record in full_text_fixed_split:
    record_split = record.split('\n')
    print(record_split[-1])

Downloads: 9 (2017-08-30 to 2017-09-13 )
Downloads: 10 (2017-09-08 to 2017-09-13 )
Downloads: 6 (2017-09-06 to 2017-09-13 )
Downloads: 47 (2017-06-15 to 2017-09-13 )
Downloads: 99 (2016-12-18 to 2017-09-13 )
Downloads: 3 (2017-12-12 to 2017-09-13 )
Downloads: 30 (2016-12-12 to 2017-09-13 )
Downloads: 42 (2016-12-12 to 2017-09-13 )
Downloads: 93 (2016-12-12 to 2017-09-13 )
Downloads: 31 (2016-12-12 to 2017-09-13 )
Downloads: 50 (2016-12-12 to 2017-09-13 )
Downloads: 12 (2017-08-11 to 2017-09-13 )
Downloads: 0 (2017-08-21 to 2017-09-13 )
Downloads: 99 (2017-07-29 to 2017-09-13 )
Downloads: 931 (2017-06-28 to 2017-09-13 )
Downloads: 28 (2017-06-16 to 2017-09-13 )
Downloads: 26 (2017-06-16 to 2017-09-13 )
Downloads: 28 (2017-06-16 to 2017-09-13 )
Downloads: 40 (2017-06-01 to 2017-09-13 )
Downloads: 68 (2017-06-01 to 2017-09-13 )
Downloads: 18 (2017-05-01 to 2017-09-13 )
Downloads: 41 (2017-05-31 to 2017-09-13 )
Downloads: 516 (2016-06-23 to 2017-09-13 )
Downloads: 72 (2017-05-22 to 2017-09

Visual inspection of the results says that we've got this part done.

# Step 3: get the downloads number

At this point we've got a good set of results. We've isolated our records and inside each record isolated each line.  With those lines now accessible via position number, we can isolate the last line of the record, which is the line with the data that we want.

So let's now consider each line and sort out a way to get those download numbers out.  We've got a string in here, but our granularity is at the level of the word.  While we don't have just words in here, we've got stuff separated by white space.  Instead of a `-----` or `\n` delimiter, we have a single space.

Back once again with our friend `.split()`.  But what do we use in this case?  We care about white space, which happens to be the default for `.split()`.  Let's just copy one of these lines in and play with splitting the content.

In [41]:
line = "Downloads: 118 (2016-06-23 to 2017-09-13 )"

print(line.split())

['Downloads:', '118', '(2016-06-23', 'to', '2017-09-13', ')']


That seems to have done a part of the job.  We've split our line into a series of elements, one of which is actually the data point that we want.  Can we exploit some consistency here?  Let's first just look at what this split does across all the download lines.

In [42]:
for record in full_text_fixed_split:
    record_split = record.split('\n')
    download_line = record_split[-1]
    print(download_line.split())

['Downloads:', '9', '(2017-08-30', 'to', '2017-09-13', ')']
['Downloads:', '10', '(2017-09-08', 'to', '2017-09-13', ')']
['Downloads:', '6', '(2017-09-06', 'to', '2017-09-13', ')']
['Downloads:', '47', '(2017-06-15', 'to', '2017-09-13', ')']
['Downloads:', '99', '(2016-12-18', 'to', '2017-09-13', ')']
['Downloads:', '3', '(2017-12-12', 'to', '2017-09-13', ')']
['Downloads:', '30', '(2016-12-12', 'to', '2017-09-13', ')']
['Downloads:', '42', '(2016-12-12', 'to', '2017-09-13', ')']
['Downloads:', '93', '(2016-12-12', 'to', '2017-09-13', ')']
['Downloads:', '31', '(2016-12-12', 'to', '2017-09-13', ')']
['Downloads:', '50', '(2016-12-12', 'to', '2017-09-13', ')']
['Downloads:', '12', '(2017-08-11', 'to', '2017-09-13', ')']
['Downloads:', '0', '(2017-08-21', 'to', '2017-09-13', ')']
['Downloads:', '99', '(2017-07-29', 'to', '2017-09-13', ')']
['Downloads:', '931', '(2017-06-28', 'to', '2017-09-13', ')']
['Downloads:', '28', '(2017-06-16', 'to', '2017-09-13', ')']
['Downloads:', '26', '(2017

Visual inspection seems to indicate that there is the consistency that we want:  the download count is always the second element of the split line.  
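One possible way to finish the job from here, sketched on just the four download lines from our sample (copied from the output above).  The names `download_lines`, `pieces`, and `total` are hypothetical, and this assumes every line follows the same layout we just inspected.

```python
# The four download lines from the sample, copied from the output above.
download_lines = [
    "Downloads: 9 (2017-08-30 to 2017-09-13 )",
    "Downloads: 10 (2017-09-08 to 2017-09-13 )",
    "Downloads: 6 (2017-09-06 to 2017-09-13 )",
    "Downloads: 47 (2017-06-15 to 2017-09-13 )",
]

total = 0
for line in download_lines:
    pieces = line.split()
    # position 1 holds the count; it is a string, so convert before adding
    total = total + int(pieces[1])

print(total)
```

For the sample this adds up to 72.  The `int()` conversion is the important part: without it we would be trying to add strings to a number.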