<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

# Major Neural Network Architectures Challenge
## *Data Science Unit 4 Sprint 3 Challenge*

In this sprint challenge, you'll explore some of the cutting edge of Data Science. This week we studied several famous neural network architectures: recurrent neural networks (RNNs), long short-term memory networks (LSTMs), convolutional neural networks (CNNs), and autoencoders. In this sprint challenge, you will revisit these models. Remember, we are testing your knowledge of these architectures, not your ability to fit a model with high accuracy.

__*Caution:*__  these approaches can be pretty heavy computationally. All problems were designed so that you should be able to achieve results within at most 5-10 minutes of runtime locally, on AWS SageMaker, on Colab or on a comparable environment. If something is running longer, double check your approach!

## Challenge Objectives
*You should be able to:*
* <a href="#p1">Part 1</a>: Train an LSTM classification model
* <a href="#p2">Part 2</a>: Utilize a pre-trained CNN for object detection
* <a href="#p3">Part 3</a>: Describe a use case for an autoencoder
* <a href="#p4">Part 4</a>: Describe yourself as a Data Scientist and elucidate your vision of AI

<a id="p1"></a>
## Part 1 - LSTMs

Use an LSTM to fit a multi-class classification model on Reuters news articles to distinguish topics of articles. The data is already encoded properly for use in an LSTM model. 

Your Tasks: 
- Use Keras to fit a predictive model, classifying news articles into topics. 
- Report your overall score and accuracy

For reference, the [Keras IMDB sentiment classification example](https://github.com/keras-team/keras/blob/master/examples/imdb_lstm.py) will be useful, as well as the LSTM code we used in class.

__*Note:*__  Focus on getting a running model, not on maxing accuracy with extreme data size or epoch numbers. Only revisit and push accuracy if you get everything else done!

In [1]:
from tensorflow.keras.datasets import reuters

(X_train, y_train), (X_test, y_test) = reuters.load_data(num_words=None,
                                                         skip_top=0,
                                                         maxlen=None,
                                                         test_split=0.2,
                                                         seed=723812,
                                                         start_char=1,
                                                         oov_char=2,
                                                         index_from=3)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/reuters.npz


In [2]:
# Demo of encoding

word_index = reuters.get_word_index(path="reuters_word_index.json")

print(f"Iran is encoded as {word_index['iran']} in the data")
print(f"London is encoded as {word_index['london']} in the data")
print("Words are encoded as numbers in our dataset.")

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/reuters_word_index.json
Iran is encoded as 779 in the data
London is encoded as 544 in the data
Words are encoded as numbers in our dataset.
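
To make the encoding concrete, here is a minimal sketch of decoding a sequence back into words. The `toy_word_index` dictionary and the `decode` helper are hypothetical stand-ins, not part of Keras. The offset of 3 mirrors the `index_from=3` argument passed to `load_data` above: word indices are shifted so that the lowest values stay free for markers (by convention 0 for padding, `start_char=1`, and `oov_char=2`). Under that assumption, a word whose `word_index` value is 779 would show up as 782 inside the sequences.

```python
# Hypothetical miniature vocabulary; the real one comes from reuters.get_word_index().
toy_word_index = {"the": 1, "of": 2, "london": 544, "iran": 779}

INDEX_FROM = 3  # mirrors index_from=3 in load_data
SPECIAL = {0: "<pad>", 1: "<start>", 2: "<oov>"}  # reserved low indices

def decode(sequence, word_index, index_from=INDEX_FROM):
    """Map encoded integers back to words, undoing the index_from offset."""
    reverse = {idx + index_from: word for word, idx in word_index.items()}
    return [SPECIAL.get(i, reverse.get(i, "<unk>")) for i in sequence]

encoded = [1, 782, 5, 547, 2]  # made-up sequence for illustration
decoded = decode(encoded, toy_word_index)
print(decoded)  # ['<start>', 'iran', 'of', 'london', '<oov>']
```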


In [3]:
# Do not change this line. You need the +1 for some reason. 
max_features = len(word_index.values()) + 1

In [7]:
# TODO - your code!
print(X_train.shape)
X_train[:2]



(8982,)


array([list([1, 248, 409, 166, 265, 1537, 1662, 8, 24, 4, 1222, 2771, 7, 227, 236, 40, 85, 944, 10, 531, 176, 8, 4, 176, 1613, 24, 1662, 297, 5157, 6, 10, 103, 5, 231, 215, 8, 7, 2889, 6, 10, 1202, 69, 4, 1222, 329, 2771, 24, 944, 23, 944, 1662, 40, 2509, 1592, 907, 69, 4, 113, 997, 762, 2539, 7, 227, 236, 17, 12]),
       list([1, 4665, 1183, 413, 381, 7, 1134, 1664, 62, 729, 7, 4, 121, 273, 93, 109, 28, 2115, 72, 11, 428, 4, 387, 989, 558, 3956, 8, 7, 25, 1213, 427, 1969, 223, 4, 213, 5, 387, 580, 8, 1145, 413, 62, 410, 451, 18, 428, 7, 4, 121, 6, 3106, 19, 11, 428, 9, 1283, 317, 65, 413, 138, 59, 12, 11, 428, 6, 6118, 63, 11, 4, 3956, 8, 3640, 1183, 413, 202, 251, 18, 428, 6, 546, 19, 11, 428, 9, 317, 65, 413, 7, 4, 1721, 427, 409, 7145, 138, 19, 19, 11, 428, 6, 3843, 70, 11, 4, 135, 5, 137, 317, 1833, 542, 9, 7145, 413, 138, 72, 47, 11, 428, 6, 19, 5106, 19, 16, 8, 17, 12])],
      dtype=object)

In [8]:
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM

In [10]:
maxlen = 80
batch_size = 32

print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

8982 train sequences
2246 test sequences


In [12]:
print('Pad Sequences (samples x time)')
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape: ', X_train.shape)
print('X_test shape: ', X_test.shape)

Pad Sequences (samples x time)
X_train shape:  (8982, 80)
X_test shape:  (2246, 80)


In [13]:
model = Sequential()

# One output unit per topic (46 topics in the Reuters dataset)
num_classes = int(y_train.max()) + 1

model.add(Embedding(max_features, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(num_classes, activation='softmax'))

# Labels are integer class ids, so use the sparse categorical loss
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', 
              metrics=['accuracy'])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, None, 128)         3965440   
_________________________________________________________________
lstm (LSTM)                  (None, 128)               131584    
_________________________________________________________________
dense (Dense)                (None, 46)                5934      
Total params: 4,102,958
Trainable params: 4,102,958
Non-trainable params: 0
_________________________________________________________________


In [15]:
alpacas = model.fit(X_train, y_train,
                    batch_size=64,  # Larger batches make each epoch faster but give fewer weight updates per epoch
                    epochs=2,
                    validation_data=(X_test, y_test))

Train on 8982 samples, validate on 2246 samples
Epoch 1/2
Epoch 2/2
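
Because the Reuters labels are integer topic ids (46 classes in the Keras dataset), a multi-class setup scores each article against every topic. The sketch below uses plain Python rather than Keras. It shows softmax turning raw scores into one probability per class, and the sparse categorical cross-entropy for a single integer label. The three-class scores are made up for illustration, and the helper names are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(label, probs):
    """Loss for one example whose true class is the integer `label`."""
    return -math.log(probs[label])

scores = [2.0, 0.5, -1.0]          # made-up scores for 3 classes
probs = softmax(scores)
predicted = max(range(len(probs)), key=probs.__getitem__)  # argmax
loss = sparse_categorical_crossentropy(0, probs)

print("probabilities:", [round(p, 3) for p in probs])
print("predicted class:", predicted)
print("loss if the true class is 0:", round(loss, 3))
```

Accuracy is then the fraction of articles whose predicted class matches the label.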


## Sequence Data Question
#### *Describe the `pad_sequences` method used on the training dataset. What does it do? Why do you need it?*

The `pad_sequences` method makes sure that all of the sequences are the same length. Sequences shorter than `maxlen` are filled with a placeholder value (0 by default), and sequences longer than `maxlen` are truncated. By default, both the padding and the truncation happen at the start of the sequence. If `maxlen` is not set, every sequence is padded to the length of the longest one. Here `maxlen = 80`, so each article is padded or cut down to 80 tokens. Everything needs to be the same shape so that a batch can be stacked into a single array and the matrix operations can take place without the program screaming about it.
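
The padding and truncation behavior can be sketched in plain Python. `pad_or_truncate` is a hypothetical helper that imitates the default `padding='pre'` and `truncating='pre'` settings of `pad_sequences`. It is an illustration under those assumptions, not the Keras implementation.

```python
def pad_or_truncate(seq, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` items (maxlen > 0)."""
    seq = list(seq)
    if len(seq) >= maxlen:
        return seq[-maxlen:]                      # 'pre' truncation drops the earliest tokens
    return [value] * (maxlen - len(seq)) + seq    # 'pre' padding adds zeros in front

short_seq = [1, 248, 409]
long_seq = [1, 2, 3, 4, 5, 6, 7]

padded = pad_or_truncate(short_seq, 5)
truncated = pad_or_truncate(long_seq, 5)
print(padded)     # [0, 0, 1, 248, 409]
print(truncated)  # [3, 4, 5, 6, 7]
```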

## RNNs versus LSTMs
#### *What are the primary motivations behind using Long Short-Term Memory (LSTM) cell units over traditional Recurrent Neural Networks?*

RNNs and LSTMs are similar in architecture in that both process a sequence one step at a time and carry a hidden state forward, so both are sensitive to the order of the data. In practice, though, a standard RNN struggles to remember things from many steps back, because the gradients that flow backward through time shrink (or occasionally blow up) as the sequence gets longer. This is the vanishing gradient problem. An LSTM adds a separate cell state plus input, forget, and output gates that control what is stored, kept, and read out. That lets it hold on to information over long stretches of a sequence, such as a word from early in a sentence, while still reacting to recent inputs. That ability to learn long-range dependencies is the main reason to choose an LSTM over a standard RNN.
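
One commonly cited motivation is the vanishing gradient problem: in a plain RNN, the gradient with respect to an early hidden state is a product of one factor per time step, and that product tends to shrink toward zero. The sketch below is a toy scalar illustration, not a real network. The weight of 0.9, the constant inputs, and the forget-gate value held fixed at 0.99 are all assumptions made for the demo. In a real LSTM the gates are learned and depend on the input.

```python
import math

def rnn_gradient(weight, inputs):
    """Return |d h_T / d h_0| for a scalar RNN h_t = tanh(weight * h_{t-1} + x_t)."""
    h, grad = 0.0, 1.0
    for x in inputs:
        h = math.tanh(weight * h + x)
        grad *= weight * (1.0 - h * h)  # chain rule through one time step
    return abs(grad)

def cell_state_gradient(forget_gate, steps):
    """Gradient along an LSTM-style cell state c_t = f * c_{t-1} + (new input)."""
    return forget_gate ** steps

steps = 50
rnn_grad = rnn_gradient(0.9, [0.5] * steps)
lstm_grad = cell_state_gradient(0.99, steps)

print(f"plain RNN gradient after {steps} steps:  {rnn_grad:.2e}")
print(f"cell-state gradient after {steps} steps: {lstm_grad:.2e}")
```

The additive cell-state path is what lets the signal from early time steps survive long enough to be learned from.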

## RNN / LSTM Use Cases
#### *Name and Describe 3 Use Cases of LSTMs or RNNs and why they are suited to that use case*

An LSTM is useful when predicting text, like in the Shakespeare homework project. The next word or character depends on what came earlier in the sentence, sometimes many steps back. A standard RNN tends to lose that context on longer passages, which produces text that drifts into nonsense. 

Since LSTMs care about sequence, they are also useful when composing music. For the same reason that we want the words in a sentence to be in a particular order, we wouldn't just want note vomit to listen to; we want the notes to make sense in a sequence.

RNNs and LSTMs can also be used for speech recognition. Audio arrives as a sequence of frames, and both the sounds within a word and the order of the words matter, so a recurrent model can use what it has already heard to work out what is being said. 


<a id="p2"></a>
## Part 2 - CNNs

### Find the Frog

Time to play "find the frog!" Use Keras and ResNet50 (pre-trained) to detect which of the following images contain frogs:

<img align="left" src="https://d3i6fh83elv35t.cloudfront.net/newshour/app/uploads/2017/03/GettyImages-654745934-1024x687.jpg" width=400>

In [16]:
from skimage.io import imread_collection
from skimage.transform import resize #This might be a helpful function for you

images = imread_collection('./frog_images/*.jpg')

In [17]:
print(type(images))
print(type(images[0]), end="\n\n")

print("Each of the Images is a Different Size")
print(images[0].shape)
print(images[1].shape)

<class 'skimage.io.collection.ImageCollection'>
<class 'numpy.ndarray'>

Each of the Images is a Different Size
(2137, 1710, 3)
(3810, 2856, 3)


In [18]:
len(images)

15

In [22]:
for img in images:
    print(img.shape)

(2137, 1710, 3)
(3810, 2856, 3)
(3456, 4608, 3)
(2500, 3335, 3)
(2000, 3008, 3)
(2883, 4319, 3)
(4000, 6000, 3)
(2642, 3918, 3)
(3456, 5184, 3)
(2912, 4368, 3)
(4928, 3285, 3)
(3702, 5397, 3)
(1856, 2784, 3)
(2592, 3872, 3)
(2673, 3382, 3)


Your goal is to validly run ResNet50 on the input images - don't worry about tuning or improving the model. Print out the predictions in any way you see fit. 

*Hint* - ResNet50 doesn't just return "frog". The three labels it has for frogs are: `bullfrog, tree_frog, tailed_frog`

*Stretch goal* - Check for other things such as fish.

In [39]:
images[0]

array([[[ 0,  0,  0],
        [ 0,  0,  0],
        [ 0,  0,  0],
        ...,
        [ 2,  2,  2],
        [ 2,  2,  2],
        [ 2,  2,  2]],

       [[ 0,  0,  0],
        [ 0,  0,  0],
        [ 0,  0,  0],
        ...,
        [ 2,  2,  2],
        [ 2,  2,  2],
        [ 2,  2,  2]],

       [[ 0,  0,  0],
        [ 0,  0,  0],
        [ 0,  0,  0],
        ...,
        [ 2,  2,  2],
        [ 2,  2,  2],
        [ 2,  2,  2]],

       ...,

       [[15, 16, 11],
        [14, 15, 10],
        [14, 15, 10],
        ...,
        [29, 29, 17],
        [28, 28, 16],
        [28, 28, 16]],

       [[14, 15, 10],
        [14, 15, 10],
        [14, 15, 10],
        ...,
        [29, 29, 17],
        [28, 28, 16],
        [28, 28, 16]],

       [[13, 14,  9],
        [13, 14,  9],
        [12, 13,  8],
        ...,
        [30, 30, 18],
        [30, 30, 18],
        [29, 29, 17]]], dtype=uint8)

In [41]:
# TODO - your code!
import numpy as np

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions

# Load the pre-trained weights once instead of on every call
resnet = ResNet50(weights='imagenet')

def resize_img(img):
    return resize(img, (224, 224))

def process_img_path(img_path):
    # Load an image file and resize it to ResNet50's 224x224 input size
    return image.load_img(img_path, target_size=(224, 224))

def img_contains_frog(img):
    # Return the probability of the first frog label in the top 3, else 0.0
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)  # add the batch dimension
    x = preprocess_input(x)
    features = resnet.predict(x)
    results = decode_predictions(features, top=3)[0]
    print(results)
    for entry in results:
        if 'frog' in entry[1]:
            return entry[2]
    return 0.0

### This is my initial, terrible run.

In [33]:
for i in range(len(images)):
    print("Image: ", i)
    img_contains_frog(process_img_path(images[i]))
    print('='*40)

[('n06359193', 'web_site', 0.06232226), ('n03196217', 'digital_clock', 0.053684328), ('n01930112', 'nematode', 0.052816603)]
[('n03729826', 'matchstick', 0.0597493), ('n06359193', 'web_site', 0.055891547), ('n03196217', 'digital_clock', 0.047030497)]
[('n06359193', 'web_site', 0.0561313), ('n03729826', 'matchstick', 0.051397342), ('n03196217', 'digital_clock', 0.04931623)]
[('n06359193', 'web_site', 0.062960126), ('n01930112', 'nematode', 0.05261105), ('n03196217', 'digital_clock', 0.048840858)]
[('n06359193', 'web_site', 0.06479917), ('n01930112', 'nematode', 0.04962921), ('n03196217', 'digital_clock', 0.04514363)]
[('n06359193', 'web_site', 0.062426906), ('n03196217', 'digital_clock', 0.046223965), ('n01930112', 'nematode', 0.043731824)]
[('n06359193', 'web_site', 0.055798694), ('n03196217', 'digital_clock', 0.055427726), ('n03729826', 'matchstick', 0.05286291)]
[('n03729826', 'matchstick', 0.051141236), ('n06359193', 'web_site', 0.048578564), ('n03196217', 'digital_clock', 0.0478521
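
The notebook does not show the exact code that produced this run, so the cause is not certain. One common way to get near-identical, low-confidence predictions like these is to feed ResNet50 pixel values on a 0-to-1 scale (which `skimage.transform.resize` returns by default) when `preprocess_input` expects values on a 0-to-255 scale. The NumPy sketch below imitates the "caffe"-style preprocessing used for ResNet50 (RGB to BGR, then subtracting fixed ImageNet channel means). `caffe_style_preprocess` is a hypothetical stand-in, not the Keras function, and the random image is made up for the demo.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by caffe-style preprocessing
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68])

def caffe_style_preprocess(rgb):
    """Flip RGB to BGR and subtract the channel means (no rescaling)."""
    bgr = rgb[..., ::-1].astype("float64")
    return bgr - IMAGENET_BGR_MEANS

rng = np.random.default_rng(0)
pixels_0_255 = rng.integers(0, 256, size=(224, 224, 3)).astype("float64")
pixels_0_1 = pixels_0_255 / 255.0  # same image on a 0-to-1 scale

expected_scale = caffe_style_preprocess(pixels_0_255)
squashed_scale = caffe_style_preprocess(pixels_0_1)

# Per-channel spread: a squashed image is almost a constant block of color
print("std per channel, 0-255 input:", expected_scale.std(axis=(0, 1)))
print("std per channel, 0-1 input:  ", squashed_scale.std(axis=(0, 1)))
```

With almost no variation left in the input, every image would look the same to the network, which would be consistent with the repeated labels above.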

### This is an attempt to improve it.

In [56]:
images_array = ["frog_images/cristiane-teston-bcnfJvEYm1Y-unsplash.jpg", 
                "frog_images/drew-brown-VBvoy5gofWg-unsplash.jpg", 
                "frog_images/ed-van-duijn-S1zA6AR50X8-unsplash.jpg", 
                "frog_images/elizabeth-explores-JZybccsrB-0-unsplash.jpg", 
                "frog_images/jacky-watt-92W5jPbOj48-unsplash.jpg", 
                "frog_images/jared-evans-VgRnolD7OIw-unsplash.jpg", 
                "frog_images/joel-henry-Rcvf6-n1gc8-unsplash.jpg", 
                "frog_images/marcus-neto-fH_DOdTt-pA-unsplash.jpg", 
                "frog_images/matthew-kosloski-sYkr-M78H6w-unsplash.jpg", 
                "frog_images/mche-lee-j-P8z4EOgyQ-unsplash.jpg", 
                "frog_images/priscilla-du-preez-oWJcgqjFb6I-unsplash.jpg", 
                "frog_images/saturday_sun-_q37Ca0Ll4o-unsplash.jpg", 
                "frog_images/serenity-mitchell-tUDSHkd6rYQ-unsplash.jpg", 
                "frog_images/yanna-zissiadou-SV-aMgliWNs-unsplash.jpg", 
                "frog_images/zdenek-machacek-HYTwWSE5ztw-unsplash (1).jpg"]

In [57]:
for img in images_array:
    img_contains_frog(process_img_path(img))
    print('='*40)

[('n07718747', 'artichoke', 0.3059211), ('n02281787', 'lycaenid', 0.23762213), ('n07730033', 'cardoon', 0.18418735)]
[('n01641577', 'bullfrog', 0.99152386), ('n01667778', 'terrapin', 0.005188096), ('n02655020', 'puffer', 0.00051200367)]
[('n02281787', 'lycaenid', 0.28462553), ('n01641577', 'bullfrog', 0.069064215), ('n01737021', 'water_snake', 0.038429268)]
[('n04259630', 'sombrero', 0.4752294), ('n02877765', 'bottlecap', 0.08707738), ('n04409515', 'tennis_ball', 0.065153785)]
[('n03991062', 'pot', 0.61137015), ('n11939491', 'daisy', 0.11355704), ('n02840245', 'binder', 0.04898852)]
[('n01644373', 'tree_frog', 0.5634745), ('n01641577', 'bullfrog', 0.2976248), ('n01644900', 'tailed_frog', 0.13862132)]
[('n01644373', 'tree_frog', 0.9657679), ('n01644900', 'tailed_frog', 0.017379181), ('n02229544', 'cricket', 0.0054068756)]
[('n03991062', 'pot', 0.32278636), ('n04409515', 'tennis_ball', 0.30953994), ('n04275548', 'spider_web', 0.13213588)]
[('n01641577', 'bullfrog', 0.7582518), ('n0239852

### OK! That's more like it. There are definitely frogs in those predictions.

<a id="p3"></a>
## Part 3 - Autoencoders

Describe a use case for an autoencoder given that an autoencoder tries to predict its own input. 

__*Your Answer:*__ Autoencoders are quite useful when you want to remove noise from some data. An autoencoder compresses its input down to a smaller representation and then decodes it back, and a denoising autoencoder is trained to reconstruct the clean version from a noisy input. For example, with the MNIST dataset that we used this week, the autoencoder can take in somewhat sloppily written digits or images that are not very crisp, compress them down, and decode them into cleaner images. Those cleaned-up images can then be passed to a separate classifier to predict which digit each one is.
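
The denoising idea can be sketched without training a network. A linear autoencoder with a squared-error loss learns the same subspace as PCA, so the example below swaps in an SVD for the training step instead of fitting a Keras model. The toy data, the noise level, and the bottleneck size of 2 are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "clean" data: 200 samples that truly live on a 2-D subspace of a 20-D space
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 20))
clean = latent @ mixing
noisy = clean + rng.normal(scale=0.3, size=clean.shape)

# "Training": find the 2-D bottleneck directions from the noisy data via SVD
mean = noisy.mean(axis=0)
_, _, vt = np.linalg.svd(noisy - mean, full_matrices=False)
components = vt[:2]  # shape (2, 20)

def encode(x):
    """Compress 20-D inputs down to the 2-D bottleneck."""
    return (x - mean) @ components.T

def decode(code):
    """Expand 2-D codes back to 20-D reconstructions."""
    return code @ components + mean

reconstructed = decode(encode(noisy))

noisy_error = np.mean((noisy - clean) ** 2)
denoised_error = np.mean((reconstructed - clean) ** 2)
print("error of noisy input vs clean:   ", noisy_error)
print("error of reconstruction vs clean:", denoised_error)
```

Squeezing the data through the bottleneck discards most of the noise, because the noise is spread over all 20 dimensions while the signal only occupies 2.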


<a id="p4"></a>
## Part 4 - More...

Answer the following questions, with a target audience of a fellow Data Scientist:

- What do you consider your strongest area, as a Data Scientist?

Honestly, I'm not sure. I recognize in myself a need to improve, as there are many things I do not understand yet. I want to improve in these areas, too, but the field is so expansive that it's hard to decide where to focus my efforts. 

- What area of Data Science would you most like to learn more about, and why?

Neural networks and artificial intelligence are the areas that are most appealing to me right now because they seem so mysterious. I haven't spent any time in reinforcement learning, but I have been captivated by seeing advances in how AI is applied to games like Go and Starcraft 2. I enjoy seeing the machine learn different strategies that make people step back and rethink what they once thought about the game.

- Where do you think Data Science will be in 5 years?

In a very similar boat to the one it is currently in, but maybe bigger. There's so much data being generated, and it would be nice for there to be ways to understand that data. The field itself will likely continue growing rapidly, and advances in AI will help get the job done. 

- What are the threats posed by AI to our society?

The biggest threats to our society from AI, in my mind, are automation and privacy concerns. As AI becomes more advanced, it will make human performance of some tasks unnecessary and even too costly, so those jobs will go to computers. This will leave people without jobs, and I'm not sure our society is ready to provide for those people and help train them for another job. Privacy is also a concern, as cameras and recorders are ubiquitous and it's not unimaginable that someone could take advantage of those and track a little too much information.

- How do you think we can counteract those threats? 

My favorite idea about dealing with automation is a universal basic income. It makes sense that if we can get the job done without people then we should be able to still provide the needs for those people. After all, decades ago the ideal was to use computers to do the job so that we, as a society, could spend more time improving ourselves as people. To counter the privacy concerns it might take laws to do so or maybe even counter AI programs to look for those taking advantage of others.

- Do you think achieving General Artificial Intelligence is ever possible?

Absolutely. Not soon, though. I don't see a definite reason why a program couldn't be written to take in data, process it, and adapt that data into new areas. This is precisely what we do as humans. Granted, we have many many generations of evolution working for us in this regard but I think we could get to the point where AI can abstract ideas from one case and use them in a different one.

A few sentences per answer is fine - only elaborate if time allows.

## Congratulations! 

Thank you for your hard work, and congratulations! You've learned a lot, and you should proudly call yourself a Data Scientist.


In [40]:
from IPython.display import HTML

HTML("""<iframe src="https://giphy.com/embed/26xivLqkv86uJzqWk" width="480" height="270" frameBorder="0" class="giphy-embed" allowFullScreen></iframe><p><a href="https://giphy.com/gifs/mumm-champagne-saber-26xivLqkv86uJzqWk">via GIPHY</a></p>""")

In [8]:
%pwd

'C:\\Users\\Neal\\Documents\\Lambda_Classnotes\\Unit_4_Sprint_3_Challenge'

In [40]:
# TODO - your code!
import numpy as np

from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions

# def resize_img(img):
#   return resize(img, (224, 224))

def process_img_path(img_path):
  return image.load_img(img_path, target_size=(224, 224))

def img_contains_frog(img):
  x = image.img_to_array(img)
  x = np.expand_dims(x, axis=0)
  x = preprocess_input(x)
  model = ResNet50(weights='imagenet')
  features = model.predict(x)
  results = decode_predictions(features, top=5)[0]
  print(results)
  prob = 0
  for entry in results:
    if 'frog' in entry[1]:
      prob += entry[2]
  if prob == 0:
    print("A frog was not predicted in the top five results.")
  else:
    print(f"The program predicts a {prob:.2f}% chance of the image containing a frog.")
#  return 0.0

In [41]:
img_contains_frog(process_img_path('frog_images/drew-brown-VBvoy5gofWg-unsplash.jpg'))

[('n01641577', 'bullfrog', 0.99152386), ('n01667778', 'terrapin', 0.005188096), ('n02655020', 'puffer', 0.00051200367), ('n02514041', 'barracouta', 0.00038378476), ('n03388043', 'fountain', 0.0002847035)]
The program predicts a 0.99% chance of the image containing a frog.
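
The message above reads 0.99%, while the summed frog probability is 0.99 on a 0-to-1 scale, which is roughly 99%. Here is a small sketch of the same top-k check using Python's `%` format specifier, which multiplies by 100 before printing. `frog_probability` is a hypothetical helper, and the `results` list is copied from the output above.

```python
def frog_probability(results):
    """Sum the probabilities of every decoded label that mentions 'frog'."""
    return sum(prob for _, label, prob in results if "frog" in label)

results = [
    ("n01641577", "bullfrog", 0.99152386),
    ("n01667778", "terrapin", 0.005188096),
    ("n02655020", "puffer", 0.00051200367),
    ("n02514041", "barracouta", 0.00038378476),
    ("n03388043", "fountain", 0.0002847035),
]

prob = frog_probability(results)
message = f"Estimated chance of the image containing a frog: {prob:.2%}"
print(message)  # Estimated chance of the image containing a frog: 99.15%
```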


In [24]:
from os import listdir
path = "./frog_images/"

images = [f for f in listdir(path)]

images

['cristiane-teston-bcnfJvEYm1Y-unsplash.jpg',
 'drew-brown-VBvoy5gofWg-unsplash.jpg',
 'ed-van-duijn-S1zA6AR50X8-unsplash.jpg',
 'elizabeth-explores-JZybccsrB-0-unsplash.jpg',
 'jacky-watt-92W5jPbOj48-unsplash.jpg',
 'jared-evans-VgRnolD7OIw-unsplash.jpg',
 'joel-henry-Rcvf6-n1gc8-unsplash.jpg',
 'marcus-neto-fH_DOdTt-pA-unsplash.jpg',
 'matthew-kosloski-sYkr-M78H6w-unsplash.jpg',
 'mche-lee-j-P8z4EOgyQ-unsplash.jpg',
 'priscilla-du-preez-oWJcgqjFb6I-unsplash.jpg',
 'saturday_sun-_q37Ca0Ll4o-unsplash.jpg',
 'serenity-mitchell-tUDSHkd6rYQ-unsplash.jpg',
 'yanna-zissiadou-SV-aMgliWNs-unsplash.jpg',
 'zdenek-machacek-HYTwWSE5ztw-unsplash (1).jpg']

In [26]:
for i in range(len(images)):
    images[i] = 'frog_images/' + images[i]
    
images

['frog_images/cristiane-teston-bcnfJvEYm1Y-unsplash.jpg',
 'frog_images/drew-brown-VBvoy5gofWg-unsplash.jpg',
 'frog_images/ed-van-duijn-S1zA6AR50X8-unsplash.jpg',
 'frog_images/elizabeth-explores-JZybccsrB-0-unsplash.jpg',
 'frog_images/jacky-watt-92W5jPbOj48-unsplash.jpg',
 'frog_images/jared-evans-VgRnolD7OIw-unsplash.jpg',
 'frog_images/joel-henry-Rcvf6-n1gc8-unsplash.jpg',
 'frog_images/marcus-neto-fH_DOdTt-pA-unsplash.jpg',
 'frog_images/matthew-kosloski-sYkr-M78H6w-unsplash.jpg',
 'frog_images/mche-lee-j-P8z4EOgyQ-unsplash.jpg',
 'frog_images/priscilla-du-preez-oWJcgqjFb6I-unsplash.jpg',
 'frog_images/saturday_sun-_q37Ca0Ll4o-unsplash.jpg',
 'frog_images/serenity-mitchell-tUDSHkd6rYQ-unsplash.jpg',
 'frog_images/yanna-zissiadou-SV-aMgliWNs-unsplash.jpg',
 'frog_images/zdenek-machacek-HYTwWSE5ztw-unsplash (1).jpg']

In [42]:
for img in images:
    img_contains_frog(process_img_path(img))
    print('='*70)

[('n07718747', 'artichoke', 0.3059211), ('n02281787', 'lycaenid', 0.23762213), ('n07730033', 'cardoon', 0.18418735), ('n11939491', 'daisy', 0.047543373), ('n02280649', 'cabbage_butterfly', 0.03801479)]
A frog was not predicted in the top five results.
[('n01641577', 'bullfrog', 0.99152386), ('n01667778', 'terrapin', 0.005188096), ('n02655020', 'puffer', 0.00051200367), ('n02514041', 'barracouta', 0.00038378476), ('n03388043', 'fountain', 0.0002847035)]
The program predicts a 0.99% chance of the image containing a frog.
[('n02281787', 'lycaenid', 0.28462553), ('n01641577', 'bullfrog', 0.069064215), ('n01737021', 'water_snake', 0.038429268), ('n03314780', 'face_powder', 0.037996422), ('n01644900', 'tailed_frog', 0.03340858)]
The program predicts a 0.10% chance of the image containing a frog.
[('n04259630', 'sombrero', 0.4752294), ('n02877765', 'bottlecap', 0.08707738), ('n04409515', 'tennis_ball', 0.065153785), ('n03983396', 'pop_bottle', 0.045623694), ('n03249569', 'drum', 0.045290284)]

In [34]:
from os import listdir
path = "./frog_images/"

images = [('frog_images/' + f) for f in listdir(path)]

print(len(images))
images

15


['frog_images/cristiane-teston-bcnfJvEYm1Y-unsplash.jpg',
 'frog_images/drew-brown-VBvoy5gofWg-unsplash.jpg',
 'frog_images/ed-van-duijn-S1zA6AR50X8-unsplash.jpg',
 'frog_images/elizabeth-explores-JZybccsrB-0-unsplash.jpg',
 'frog_images/jacky-watt-92W5jPbOj48-unsplash.jpg',
 'frog_images/jared-evans-VgRnolD7OIw-unsplash.jpg',
 'frog_images/joel-henry-Rcvf6-n1gc8-unsplash.jpg',
 'frog_images/marcus-neto-fH_DOdTt-pA-unsplash.jpg',
 'frog_images/matthew-kosloski-sYkr-M78H6w-unsplash.jpg',
 'frog_images/mche-lee-j-P8z4EOgyQ-unsplash.jpg',
 'frog_images/priscilla-du-preez-oWJcgqjFb6I-unsplash.jpg',
 'frog_images/saturday_sun-_q37Ca0Ll4o-unsplash.jpg',
 'frog_images/serenity-mitchell-tUDSHkd6rYQ-unsplash.jpg',
 'frog_images/yanna-zissiadou-SV-aMgliWNs-unsplash.jpg',
 'frog_images/zdenek-machacek-HYTwWSE5ztw-unsplash (1).jpg']
