#### Streamlit reloads on save, so you can work in a fast, interactive loop

- write code
- save
- review output in app
- continue

#### STEP 1

1. create a new script (this one is called uber_pickups.py)
2. open uber_pickups.py in an IDE (integrated development environment) or text editor (*the file you are writing this in is the file that you need to open*)

In [1]:
#import the libraries you need

import streamlit as st
import pandas as pd
import numpy as np

#set the app title with st.title('x'), much as you would title a plot in matplotlib
st.title('Uber pickups in NYC')

2023-01-11 15:14:04.225 
  command:

    streamlit run C:\Users\achie\anaconda3\lib\site-packages\ipykernel_launcher.py [ARGUMENTS]


DeltaGenerator(_root_container=0, _provided_cursor=None, _parent=None, _block_type=None, _form_data=None)

In [2]:
#then run streamlit from the command line
#OR copy and paste the link streamlit provides you into a browser

#>streamlit run uber_pickups.py

#use this command to view the app whenever you need
#you can also pass a URL to streamlit run in the same format
#esp helpful for github gists

- remember that you need to download the notebook as a .py file before you can use it to create a streamlit application
- to do that, click the .ipynb file to open it, go to File, click Download, and select the file type you want to download it as

#### STEP 2 

1. get some data
2. test this app's function and review the output

In [3]:
DATE_COLUMN = 'date/time'
#you can add data by directly linking to it too
DATA_URL = ('https://s3-us-west-2.amazonaws.com/'
         'streamlit-demo-data/uber-raw-data-sep14.csv.gz')

#make a function that reads the csv file and cache its result
#(newer Streamlit releases deprecate st.cache in favor of st.cache_data)
@st.cache
def load_data(nrows):
    data = pd.read_csv(DATA_URL, nrows=nrows)
    #helper that lowercases a column name
    lowercase = lambda x: str(x).lower()
    #rename every column to its lowercase form
    data.rename(lowercase, axis='columns', inplace=True)
    #change date column from text to datetime
    data[DATE_COLUMN] = pd.to_datetime(data[DATE_COLUMN])
    return data
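
The renaming and datetime steps inside load_data can be tried without Streamlit or a network connection. The sketch below is only an illustration: the two CSV rows, the SAMPLE_CSV name and the load_sample helper are made up for this example and are not part of the real dataset or the original script.

```python
import io

import pandas as pd

DATE_COLUMN = 'date/time'

# Hypothetical stand-in for the remote CSV: two made-up rows
SAMPLE_CSV = (
    "Date/Time,Lat,Lon,Base\n"
    "9/1/2014 0:01:00,40.2201,-74.0021,B02512\n"
    "9/1/2014 17:30:00,40.7500,-74.0027,B02512\n"
)

def load_sample(nrows):
    # read from an in-memory buffer instead of DATA_URL
    data = pd.read_csv(io.StringIO(SAMPLE_CSV), nrows=nrows)
    # lowercase every column name
    data = data.rename(columns=str.lower)
    # change date column from text to datetime
    data[DATE_COLUMN] = pd.to_datetime(data[DATE_COLUMN])
    return data

sample = load_sample(2)
print(list(sample.columns))
print(sample[DATE_COLUMN].dt.hour.tolist())
```

Once the column is a datetime, the .dt accessor gives direct access to the hour, which the later steps rely on.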

In [4]:
# Create a text element and let the reader know the data is loading.
data_load_state = st.text('Loading data...')
# Load 10,000 rows of data into the dataframe.
data = load_data(10000)
# Notify the reader that the data was successfully loaded.
data_load_state.text("Done! (using st.cache)")

InternalHashError: module '__main__' has no attribute '__file__'

While caching the body of `load_data()`, Streamlit encountered an
object of type `builtins.function`, which it does not know how to hash.

**In this specific case, it's very likely you found a Streamlit bug so please
[file a bug report here.]
(https://github.com/streamlit/streamlit/issues/new/choose)**

In the meantime, you can try bypassing this error by registering a custom
hash function via the `hash_funcs` keyword in @st.cache(). For example:

```
@st.cache(hash_funcs={builtins.function: my_hash_func})
def my_func(...):
    ...
```

If you don't know where the object of type `builtins.function` is coming
from, try looking at the hash chain below for an object that you do recognize,
then pass that to `hash_funcs` instead:

```
Object of type builtins.function: <function load_data at 0x0000017CA3EF9790>
```

Please see the `hash_funcs` [documentation](https://docs.streamlit.io/library/advanced-features/caching#the-hash_funcs-parameter)
for more details.
            

#### STEP 3

1. loading a lot of data takes a long time, and re-running the download and the text-to-datetime conversion every time you press 'r' to refresh is slow too
2. you can cache the data in streamlit (done with the @st.cache decorator above load_data in In [3])
3. inspect raw data using st.subheader and st.write

In [None]:
if st.checkbox('Show raw data'):
    st.subheader('Raw data')
    st.write(data)
#st.write renders anything passed to it, so if you pass a table
#it renders it as an interactive table

st.subheader('Number of pickups by hour')

In [None]:
#use numpy to make a histogram breaking down pickup times in bins by hour
hist_values = np.histogram(
    data[DATE_COLUMN].dt.hour, bins=24, range=(0,24))[0]
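
To see what the np.histogram call returns, here is a small sketch on a handful of made-up pickup hours (the hours array is an assumption for illustration, not real data). With bins=24 and range=(0,24), each bin covers exactly one hour, and index [0] of the result holds the counts.

```python
import numpy as np

# Hypothetical pickup hours, standing in for data[DATE_COLUMN].dt.hour
hours = np.array([0, 0, 5, 17, 17, 17, 23])

# 24 one-hour bins covering 0..24; [0] selects the counts, [1] would be the bin edges
counts = np.histogram(hours, bins=24, range=(0, 24))[0]

print(len(counts))            # one count per hour of the day
print(int(counts.argmax()))   # the busiest hour in this sample
```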

In [5]:
#use Streamlit's method to draw the histogram
st.bar_chart(hist_values)

NameError: name 'hist_values' is not defined

#### How does @st.cache work?

- When you mark a function with Streamlit’s cache annotation, it tells Streamlit that whenever the function is called, it should check three things:

1. The actual bytecode that makes up the body of the function
2. Code, variables, and files that the function depends on
3. The input parameters that you called the function with

- If this is the first time Streamlit has seen these items, with these exact values, and in this exact combination, it runs the function and stores the result in a local cache
- The next time the function is called, if the three values haven't changed, then Streamlit knows it can skip executing the function altogether. Instead, it reads the output from the local cache and passes it on to the caller

#### Limitations:

- Streamlit will only check for changes within the current working directory. If you upgrade a Python library, Streamlit's cache will only notice this if that library is installed inside your working directory.
- If your function is not deterministic (that is, its output depends on random numbers), or if it pulls data from an external time-varying source (for example, a live stock market ticker service), the cached value will be none the wiser: it keeps returning the old result.
- Lastly, you should not mutate the output of a cached function since cached values are stored by reference (for performance reasons and to be able to support libraries such as TensorFlow). Note that, here, Streamlit is smart enough to detect these mutations and show a loud warning explaining how to fix the problem.
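
Why mutating a cached result is risky can be shown with any cache that stores values by reference. The sketch below uses functools.lru_cache from the standard library as a stand-in (an assumption made for illustration; it is not Streamlit's cache and shows no warning), with a hypothetical load_rows function.

```python
import functools

@functools.lru_cache(maxsize=None)
def load_rows():
    # pretend this is an expensive load
    return [1, 2, 3]

first = load_rows()
first.append(99)       # mutates the very object held by the cache
second = load_rows()   # cache hit: returns the same, now-modified list
print(second)          # [1, 2, 3, 99]

# working on a copy leaves the cached value untouched
safe = list(load_rows())
safe.append(0)
print(load_rows())     # [1, 2, 3, 99]
```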

#### STEP 4
1. plot your data on a map to figure out where pickups are concentrated throughout the city
2. you could stick with a bar chart to show this data, but it would not be easy for people using the app to interpret
3. the map drawn by st.map is fully interactive

In [7]:
#st.subheader('Map of all pickups')
#plot data on the map using st.map()
#st.map(data)

NameError: name 'data' is not defined

4. the busiest time shown on the histogram was 5 PM
5. redraw the map to show the concentration of pickups at 17:00 (the datetime conversion gives hours in 24-hour format, so 5 PM is 17)
6. replace the previous code

In [8]:
hour_to_filter = st.slider('hour', 0, 23, 17)  # min: 0h, max: 23h, default: 17h
filtered_data = data[data[DATE_COLUMN].dt.hour == hour_to_filter]
st.subheader(f'Map of all pickups at {hour_to_filter}:00')
st.map(filtered_data)

NameError: name 'data' is not defined
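
The hour filter in In [8] is an ordinary pandas boolean mask, so it can be checked without Streamlit. A minimal sketch, assuming a tiny made-up DataFrame in place of the real data and a fixed hour in place of the slider value:

```python
import pandas as pd

DATE_COLUMN = 'date/time'

# Hypothetical sample: three made-up pickups
data = pd.DataFrame({
    DATE_COLUMN: pd.to_datetime([
        '2014-09-01 08:15:00',
        '2014-09-01 17:05:00',
        '2014-09-02 17:45:00',
    ]),
    'lat': [40.71, 40.75, 40.73],
    'lon': [-74.00, -73.99, -73.98],
})

hour_to_filter = 17  # stands in for the value returned by st.slider
# keep only the rows whose pickup hour equals the chosen hour
filtered_data = data[data[DATE_COLUMN].dt.hour == hour_to_filter]
print(filtered_data)
```

In the app, st.map receives this filtered frame, so only the rows for the chosen hour are drawn.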

#### STEP 5

1. add a slider so that the user can filter results by hour in real time
2. replace the hour_to_filter assignment with the st.slider() call shown in In [8]

#### STEP 6
1. use a checkbox to show/hide the raw data
2. replace the bare st.subheader('Raw data') and st.write(data) lines with the if st.checkbox('Show raw data'): block (in STEP 3)