# Practice

In this practice, we will work on the following topics:

* Dates and times with pandas
* Regular expressions with pandas
* Getting data from APIs with the `requests` library

The APIs that we are going to work with are the following:

* Position of the International Space Station (ISS) API
    * http://open-notify.org/Open-Notify-API/ISS-Location-Now/
    * For this API call, you just need to pass the URL and it will return the current position of the ISS.
* Kanye West quotes API
    * https://kanye.rest/
    * For this API call, you just need to pass the URL and it will return a random Kanye West quote.

### Exercise 1

Use the ISS API to get the current position of the ISS.

In [7]:
import requests

response = requests.get(url="http://api.open-notify.org/iss-now.json")
response_json = response.json()
response_json

{'message': 'success',
 'iss_position': {'latitude': '19.6180', 'longitude': '-13.9594'},
 'timestamp': 1738610638}

### Exercise 2

If you check the `timestamp` value in the response, you will see that it is in Unix time. The Unix timestamp represents the number of seconds that have passed since the Unix epoch time (January 1, 1970). Convert this to a timestamp in ISO format (YYYY-MM-DD HH:MM:SS).

You can do that using [pd.to_datetime()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html) with the parameter `unit`. Choose the right unit to convert the Unix timestamp to a timestamp in ISO format.


In [8]:
response_json['timestamp']

1738610638

In [10]:
import datetime
import pandas as pd

In [21]:
response_json['timestamp'] = pd.to_datetime(response_json['timestamp'], unit='s')
response_json

{'message': 'success',
 'iss_position': {'latitude': '19.6180', 'longitude': '-13.9594'},
 'timestamp': Timestamp('2025-02-03 19:23:58')}
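
The cell above yields a pandas `Timestamp` object, which is timezone-naive and expressed in UTC. If a plain string in the `YYYY-MM-DD HH:MM:SS` layout is needed, a minimal sketch could look like the following. It reuses the timestamp value from the response shown above; the variable names are only illustrative.

```python
import pandas as pd

unix_seconds = 1738610638  # value taken from the response above

# unit='s' tells pandas that the integer counts seconds since the Unix epoch
ts = pd.to_datetime(unix_seconds, unit='s')

# Format the Timestamp as an ISO-style string
iso_string = ts.strftime('%Y-%m-%d %H:%M:%S')
print(iso_string)
```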

### Exercise 3

Using the [sleep function](https://docs.python.org/3/library/time.html#time.sleep) from the `time` library, write a function that prints the current datetime every 5 seconds. The function should stop after 10 iterations.

You can use the function `pd.Timestamp.now()` to get the current datetime at each iteration.

2025-01-28 11:39:47.045953
2025-01-28 11:39:52.051216
2025-01-28 11:39:57.056865
2025-01-28 11:40:02.062347
2025-01-28 11:40:07.067919
2025-01-28 11:40:12.070182
2025-01-28 11:40:17.075704
2025-01-28 11:40:22.080600
2025-01-28 11:40:27.084802
2025-01-28 11:40:32.087584


In [27]:
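import time

def print_current_datetime(iterations=10, delay=5):
    # Print the current datetime, then wait `delay` seconds, `iterations` times
    for _ in range(iterations):
        print(pd.Timestamp.now())
        time.sleep(delay)

print_current_datetime()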

2025-02-03 20:36:35.973245
2025-02-03 20:36:40.974688
2025-02-03 20:36:45.975781
2025-02-03 20:36:50.976579
2025-02-03 20:36:55.977892
2025-02-03 20:37:00.979049
2025-02-03 20:37:05.979749
2025-02-03 20:37:10.980841
2025-02-03 20:37:15.981698
2025-02-03 20:37:20.982732


### Exercise 4

Create a function that requests the position of the ISS 10 times, using `sleep` to wait 5s between requests, and returns a list with the dictionaries from the responses.

In [29]:
import time

import pandas as pd
import requests

def get_iss_position_10_times():
    responses = []  # one dictionary per request
    for _ in range(10):
        response = requests.get(url="http://api.open-notify.org/iss-now.json")
        response_json = response.json()
        response_json['timestamp'] = pd.to_datetime(response_json['timestamp'], unit='s')
        print(response_json)
        responses.append(response_json)
        time.sleep(5)
    return responses

responses = get_iss_position_10_times()

{'message': 'success', 'iss_position': {'latitude': '-27.0436', 'longitude': '22.3076'}, 'timestamp': Timestamp('2025-02-03 19:39:39')}
{'message': 'success', 'iss_position': {'latitude': '-27.2752', 'longitude': '22.5401'}, 'timestamp': Timestamp('2025-02-03 19:39:44')}
{'message': 'success', 'iss_position': {'latitude': '-27.6680', 'longitude': '22.9378'}, 'timestamp': Timestamp('2025-02-03 19:39:53')}
{'message': 'success', 'iss_position': {'latitude': '-27.9214', 'longitude': '23.1967'}, 'timestamp': Timestamp('2025-02-03 19:39:58')}
{'message': 'success', 'iss_position': {'latitude': '-28.1745', 'longitude': '23.4573'}, 'timestamp': Timestamp('2025-02-03 19:40:04')}
{'message': 'success', 'iss_position': {'latitude': '-28.4268', 'longitude': '23.7190'}, 'timestamp': Timestamp('2025-02-03 19:40:09')}
{'message': 'success', 'iss_position': {'latitude': '-28.7241', 'longitude': '24.0300'}, 'timestamp': Timestamp('2025-02-03 19:40:16')}
{'message': 'success', 'iss_position': {'latitud

### Exercise 5

Create a DataFrame with the responses from the previous exercise. The DataFrame should have the following columns:

* `timestamp`: the timestamp of the response
* `latitude`: the latitude of the ISS
* `longitude`: the longitude of the ISS

Unnamed: 0,timestamp,latitude,longitude
0,2025-01-28 10:40:37,-49.3484,-121.6249
1,2025-01-28 10:40:42,-49.4549,-121.127
2,2025-01-28 10:40:47,-49.5497,-120.6724
3,2025-01-28 10:40:53,-49.6517,-120.1703
4,2025-01-28 10:40:59,-49.7691,-119.574
5,2025-01-28 10:41:06,-49.8832,-118.9748
6,2025-01-28 10:41:11,-49.9686,-118.5119
7,2025-01-28 10:41:16,-50.0602,-118.0009
8,2025-01-28 10:41:24,-50.1812,-117.3006
9,2025-01-28 10:41:29,-50.267,-116.7846
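
One possible way to get from raw responses to a table like this is to flatten each dictionary first and build the DataFrame once at the end. The sketch below assumes a list of responses shaped like the one in Exercise 1. `responses_to_frame` is a hypothetical helper name, and the sample values are taken from outputs shown earlier (the second timestamp converted back to Unix seconds).

```python
import pandas as pd

def responses_to_frame(responses):
    """Flatten raw ISS API responses into a timestamp/latitude/longitude DataFrame."""
    rows = [
        {
            'timestamp': pd.to_datetime(r['timestamp'], unit='s'),
            # Coordinates arrive as strings, so cast them to floats
            'latitude': float(r['iss_position']['latitude']),
            'longitude': float(r['iss_position']['longitude']),
        }
        for r in responses
    ]
    return pd.DataFrame(rows, columns=['timestamp', 'latitude', 'longitude'])

# Two raw responses in the shape returned by the API
sample = [
    {'message': 'success',
     'iss_position': {'latitude': '19.6180', 'longitude': '-13.9594'},
     'timestamp': 1738610638},
    {'message': 'success',
     'iss_position': {'latitude': '-27.0436', 'longitude': '22.3076'},
     'timestamp': 1738611579},
]
iss_df = responses_to_frame(sample)
print(iss_df)
```

Collecting the rows in a list and calling `pd.DataFrame` once avoids growing the DataFrame inside the loop.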


In [62]:
import requests
import time
import pandas as pd
import datetime

def get_iss_position_10_times():
    rows = []  # one flat dictionary per request

    for _ in range(10):
        response = requests.get(url="http://api.open-notify.org/iss-now.json")
        response_json = response.json()

        rows.append({
            'timestamp': pd.to_datetime(response_json['timestamp'], unit='s'),
            # The API returns the coordinates as strings, so convert them to floats
            'latitude': float(response_json['iss_position']['latitude']),
            'longitude': float(response_json['iss_position']['longitude']),
        })
        time.sleep(5)

    # Build the DataFrame once instead of appending row by row
    return pd.DataFrame(rows, columns=['timestamp', 'latitude', 'longitude'])


get_iss_position_10_times()


ConnectTimeout: HTTPConnectionPool(host='api.open-notify.org', port=80): Max retries exceeded with url: /iss-now.json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x0000026176A22F00>, 'Connection to api.open-notify.org timed out. (connect timeout=None)'))
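
The `ConnectTimeout` above shows that a single failed request aborts the whole collection loop. One possible way to make the loop more tolerant is sketched below. `collect_with_retries` is a hypothetical helper, not part of `requests` or pandas; it accepts any zero-argument callable, so it can be tried out without network access.

```python
import time

def collect_with_retries(fetch, n=10, delay=5.0, max_retries=3):
    """Call `fetch` n times, retrying each call up to `max_retries` times on failure."""
    results = []
    for i in range(n):
        for attempt in range(1, max_retries + 1):
            try:
                results.append(fetch())
                break  # this request succeeded, move on to the next one
            except Exception:
                if attempt == max_retries:
                    raise  # give up after the last allowed attempt
                time.sleep(delay)  # wait before retrying
        if i < n - 1:
            time.sleep(delay)  # wait between requests
    return results

# Demo with a fake fetch function that fails once before succeeding
calls = {'count': 0}

def flaky_fetch():
    calls['count'] += 1
    if calls['count'] == 1:
        raise ConnectionError('simulated timeout')
    return {'message': 'success'}

demo = collect_with_retries(flaky_fetch, n=3, delay=0, max_retries=2)
print(len(demo))
```

With the real API, `fetch` could be something like `lambda: requests.get("http://api.open-notify.org/iss-now.json", timeout=10).json()`. The `timeout` argument makes a stalled connection fail after the given number of seconds instead of waiting indefinitely. Catching `Exception` keeps the sketch short; `requests.exceptions.RequestException` would be a narrower choice.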

In [41]:
response_json

{'message': 'success',
 'iss_position': {'latitude': '19.6180', 'longitude': '-13.9594'},
 'timestamp': Timestamp('2025-02-03 19:23:58')}

In [49]:
df1=pd.DataFrame(columns=['timestamp', 'latitude', 'longitude'])
transform_response = {'timestamp': response_json['timestamp'], 'latitude': response_json['iss_position']['latitude'], 'longitude': response_json['iss_position']['longitude']}

transform_response

{'timestamp': Timestamp('2025-02-03 19:23:58'),
 'latitude': '19.6180',
 'longitude': '-13.9594'}

In [56]:
df2 = pd.DataFrame(transform_response, index=[0])
df2

Unnamed: 0,timestamp,latitude,longitude
0,2025-02-03 19:23:58,19.618,-13.9594


In [59]:
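# `_append` is a private pandas method; `pd.concat` is the public way to add rows
df2 = pd.concat([df2, pd.DataFrame([transform_response])], ignore_index=True)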
df2

Unnamed: 0,timestamp,latitude,longitude
0,2025-02-03 19:23:58,19.618,-13.9594
1,2025-02-03 19:23:58,19.618,-13.9594
2,2025-02-03 19:23:58,19.618,-13.9594


### Exercise 6

Read about the `diff` method in pandas and use it to calculate the difference between the timestamps of consecutive requests. Why is it not always exactly 5s?

In [63]:
data = {
    'timestamp': pd.to_datetime([
        '2025-01-28 10:40:37', '2025-01-28 10:40:42', '2025-01-28 10:40:47', '2025-01-28 10:40:53',
        '2025-01-28 10:40:59', '2025-01-28 10:41:06', '2025-01-28 10:41:11', '2025-01-28 10:41:16',
        '2025-01-28 10:41:24', '2025-01-28 10:41:29'
    ]),
    'latitude': [-49.3484, -49.4549, -49.5497, -49.6517, -49.7691, -49.8832, -49.9686, -50.0602, -50.1812, -50.2670],
    'longitude': [-121.6249, -121.1270, -120.6724, -120.1703, -119.5740, -118.9748, -118.5119, -118.0009, -117.3006, -116.7846]
    
}

df = pd.DataFrame(data)

df


Unnamed: 0,timestamp,latitude,longitude
0,2025-01-28 10:40:37,-49.3484,-121.6249
1,2025-01-28 10:40:42,-49.4549,-121.127
2,2025-01-28 10:40:47,-49.5497,-120.6724
3,2025-01-28 10:40:53,-49.6517,-120.1703
4,2025-01-28 10:40:59,-49.7691,-119.574
5,2025-01-28 10:41:06,-49.8832,-118.9748
6,2025-01-28 10:41:11,-49.9686,-118.5119
7,2025-01-28 10:41:16,-50.0602,-118.0009
8,2025-01-28 10:41:24,-50.1812,-117.3006
9,2025-01-28 10:41:29,-50.267,-116.7846


Unnamed: 0,timestamp,latitude,longitude,timestamp_diff
0,2025-01-28 10:40:37,-49.3484,-121.6249,
1,2025-01-28 10:40:42,-49.4549,-121.127,5.0
2,2025-01-28 10:40:47,-49.5497,-120.6724,5.0
3,2025-01-28 10:40:53,-49.6517,-120.1703,6.0
4,2025-01-28 10:40:59,-49.7691,-119.574,6.0
5,2025-01-28 10:41:06,-49.8832,-118.9748,7.0
6,2025-01-28 10:41:11,-49.9686,-118.5119,5.0
7,2025-01-28 10:41:16,-50.0602,-118.0009,5.0
8,2025-01-28 10:41:24,-50.1812,-117.3006,8.0
9,2025-01-28 10:41:29,-50.267,-116.7846,5.0


In [None]:
df['timestamp_diff'] = df['timestamp'].diff()


In [67]:
df

Unnamed: 0,timestamp,latitude,longitude,timestamp_diff
0,2025-01-28 10:40:37,-49.3484,-121.6249,NaT
1,2025-01-28 10:40:42,-49.4549,-121.127,0 days 00:00:05
2,2025-01-28 10:40:47,-49.5497,-120.6724,0 days 00:00:05
3,2025-01-28 10:40:53,-49.6517,-120.1703,0 days 00:00:06
4,2025-01-28 10:40:59,-49.7691,-119.574,0 days 00:00:06
5,2025-01-28 10:41:06,-49.8832,-118.9748,0 days 00:00:07
6,2025-01-28 10:41:11,-49.9686,-118.5119,0 days 00:00:05
7,2025-01-28 10:41:16,-50.0602,-118.0009,0 days 00:00:05
8,2025-01-28 10:41:24,-50.1812,-117.3006,0 days 00:00:08
9,2025-01-28 10:41:29,-50.267,-116.7846,0 days 00:00:05
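
`diff` on a datetime column returns `Timedelta` values. If plain numbers of seconds are preferred, as in the expected output of this exercise, `.dt.total_seconds()` converts them. The intervals exceed the 5s pause presumably because each request also takes some time to complete. A minimal sketch with the first timestamps from this exercise:

```python
import pandas as pd

timestamps = pd.Series(pd.to_datetime([
    '2025-01-28 10:40:37', '2025-01-28 10:40:42', '2025-01-28 10:40:47',
    '2025-01-28 10:40:53', '2025-01-28 10:40:59',
]))

# diff() subtracts the previous row from the current one;
# the first row has no predecessor, so its result is NaT
deltas = timestamps.diff()

# Convert the Timedelta values to seconds as floats (NaT becomes NaN)
seconds = deltas.dt.total_seconds()
print(seconds.tolist())
```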


### Exercise 7

I've changed my mind and now we need a new column that contains tuples with the latitude and longitude of the ISS. Create this column.

Unnamed: 0,timestamp,latitude,longitude,timestamp_diff,position_tuple
0,2025-01-28 10:40:37,-49.3484,-121.6249,,"(-49.3484, -121.6249)"
1,2025-01-28 10:40:42,-49.4549,-121.127,5.0,"(-49.4549, -121.127)"
2,2025-01-28 10:40:47,-49.5497,-120.6724,5.0,"(-49.5497, -120.6724)"
3,2025-01-28 10:40:53,-49.6517,-120.1703,6.0,"(-49.6517, -120.1703)"
4,2025-01-28 10:40:59,-49.7691,-119.574,6.0,"(-49.7691, -119.574)"
5,2025-01-28 10:41:06,-49.8832,-118.9748,7.0,"(-49.8832, -118.9748)"
6,2025-01-28 10:41:11,-49.9686,-118.5119,5.0,"(-49.9686, -118.5119)"
7,2025-01-28 10:41:16,-50.0602,-118.0009,5.0,"(-50.0602, -118.0009)"
8,2025-01-28 10:41:24,-50.1812,-117.3006,8.0,"(-50.1812, -117.3006)"
9,2025-01-28 10:41:29,-50.267,-116.7846,5.0,"(-50.267, -116.7846)"


In [74]:
df['position_tuple'] = list(zip(df['latitude'], df['longitude']))
df

Unnamed: 0,timestamp,latitude,longitude,timestamp_diff,position_tuple
0,2025-01-28 10:40:37,-49.3484,-121.6249,NaT,"(-49.3484, -121.6249)"
1,2025-01-28 10:40:42,-49.4549,-121.127,0 days 00:00:05,"(-49.4549, -121.127)"
2,2025-01-28 10:40:47,-49.5497,-120.6724,0 days 00:00:05,"(-49.5497, -120.6724)"
3,2025-01-28 10:40:53,-49.6517,-120.1703,0 days 00:00:06,"(-49.6517, -120.1703)"
4,2025-01-28 10:40:59,-49.7691,-119.574,0 days 00:00:06,"(-49.7691, -119.574)"
5,2025-01-28 10:41:06,-49.8832,-118.9748,0 days 00:00:07,"(-49.8832, -118.9748)"
6,2025-01-28 10:41:11,-49.9686,-118.5119,0 days 00:00:05,"(-49.9686, -118.5119)"
7,2025-01-28 10:41:16,-50.0602,-118.0009,0 days 00:00:05,"(-50.0602, -118.0009)"
8,2025-01-28 10:41:24,-50.1812,-117.3006,0 days 00:00:08,"(-50.1812, -117.3006)"
9,2025-01-28 10:41:29,-50.267,-116.7846,0 days 00:00:05,"(-50.267, -116.7846)"


### Exercise 8

Take the column with the tuples and zip it with a shifted copy of itself, like this:

```python
df['new_column'] = list(zip(df['position'].shift(), df['position']))
```

Unnamed: 0,timestamp,latitude,longitude,timestamp_diff,position_tuple,pos_start_end
0,2025-01-28 10:40:37,-49.3484,-121.6249,,"(-49.3484, -121.6249)","(None, (-49.3484, -121.6249))"
1,2025-01-28 10:40:42,-49.4549,-121.127,5.0,"(-49.4549, -121.127)","((-49.3484, -121.6249), (-49.4549, -121.127))"
2,2025-01-28 10:40:47,-49.5497,-120.6724,5.0,"(-49.5497, -120.6724)","((-49.4549, -121.127), (-49.5497, -120.6724))"
3,2025-01-28 10:40:53,-49.6517,-120.1703,6.0,"(-49.6517, -120.1703)","((-49.5497, -120.6724), (-49.6517, -120.1703))"
4,2025-01-28 10:40:59,-49.7691,-119.574,6.0,"(-49.7691, -119.574)","((-49.6517, -120.1703), (-49.7691, -119.574))"
5,2025-01-28 10:41:06,-49.8832,-118.9748,7.0,"(-49.8832, -118.9748)","((-49.7691, -119.574), (-49.8832, -118.9748))"
6,2025-01-28 10:41:11,-49.9686,-118.5119,5.0,"(-49.9686, -118.5119)","((-49.8832, -118.9748), (-49.9686, -118.5119))"
7,2025-01-28 10:41:16,-50.0602,-118.0009,5.0,"(-50.0602, -118.0009)","((-49.9686, -118.5119), (-50.0602, -118.0009))"
8,2025-01-28 10:41:24,-50.1812,-117.3006,8.0,"(-50.1812, -117.3006)","((-50.0602, -118.0009), (-50.1812, -117.3006))"
9,2025-01-28 10:41:29,-50.267,-116.7846,5.0,"(-50.267, -116.7846)","((-50.1812, -117.3006), (-50.267, -116.7846))"


In [None]:
df['pos_start_end'] = list(zip(df['position_tuple'].shift(), df['position_tuple']))
df

Unnamed: 0,timestamp,latitude,longitude,timestamp_diff,position_tuple,pos_start_end
0,2025-01-28 10:40:37,-49.3484,-121.6249,NaT,"(-49.3484, -121.6249)","(None, (-49.3484, -121.6249))"
1,2025-01-28 10:40:42,-49.4549,-121.127,0 days 00:00:05,"(-49.4549, -121.127)","((-49.3484, -121.6249), (-49.4549, -121.127))"
2,2025-01-28 10:40:47,-49.5497,-120.6724,0 days 00:00:05,"(-49.5497, -120.6724)","((-49.4549, -121.127), (-49.5497, -120.6724))"
3,2025-01-28 10:40:53,-49.6517,-120.1703,0 days 00:00:06,"(-49.6517, -120.1703)","((-49.5497, -120.6724), (-49.6517, -120.1703))"
4,2025-01-28 10:40:59,-49.7691,-119.574,0 days 00:00:06,"(-49.7691, -119.574)","((-49.6517, -120.1703), (-49.7691, -119.574))"
5,2025-01-28 10:41:06,-49.8832,-118.9748,0 days 00:00:07,"(-49.8832, -118.9748)","((-49.7691, -119.574), (-49.8832, -118.9748))"
6,2025-01-28 10:41:11,-49.9686,-118.5119,0 days 00:00:05,"(-49.9686, -118.5119)","((-49.8832, -118.9748), (-49.9686, -118.5119))"
7,2025-01-28 10:41:16,-50.0602,-118.0009,0 days 00:00:05,"(-50.0602, -118.0009)","((-49.9686, -118.5119), (-50.0602, -118.0009))"
8,2025-01-28 10:41:24,-50.1812,-117.3006,0 days 00:00:08,"(-50.1812, -117.3006)","((-50.0602, -118.0009), (-50.1812, -117.3006))"
9,2025-01-28 10:41:29,-50.267,-116.7846,0 days 00:00:05,"(-50.267, -116.7846)","((-50.1812, -117.3006), (-50.267, -116.7846))"


### Exercise 9

Use the `haversine` [library](https://pypi.org/project/haversine/) with a lambda function on the column with the two positions you just calculated to compute the distance between the two points. How can you deal with the missing value in the first row?

The usage of the haversine library is as follows:

```python
from haversine import haversine

coord1 = (52.2296756, 21.0122287) # (lat, lon)
coord2 = (52.406374, 16.9251681) # (lat, lon)

haversine(coord1, coord2) # distance in km
```

Now calculate the speed of the ISS between two points. The speed should be stored in a new column in the DataFrame, in km/h.
$$speed = \frac{distance}{time}$$


Extra: If you want to manually calculate the distance between two points given their latitude and longitude, you can use the [haversine formula](https://en.wikipedia.org/wiki/Haversine_formula).
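
For reference, the haversine formula gives the great-circle distance $d$ between two points with latitudes $\varphi_1, \varphi_2$ and longitudes $\lambda_1, \lambda_2$ (all in radians) on a sphere of radius $R$ (roughly 6371 km for the Earth):

$$d = 2R \arcsin\left(\sqrt{\sin^2\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos\varphi_1 \cos\varphi_2 \sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right)$$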

Unnamed: 0,timestamp,latitude,longitude,timestamp_diff,position_tuple,pos_start_end,distance,speed_kmh
0,2025-01-28 10:40:37,-49.3484,-121.6249,,"(-49.3484, -121.6249)","((-49.3484, -121.6249), (-49.3484, -121.6249))",0.0,
1,2025-01-28 10:40:42,-49.4549,-121.127,5.0,"(-49.4549, -121.127)","((-49.3484, -121.6249), (-49.4549, -121.127))",37.924522,27305.655643
2,2025-01-28 10:40:47,-49.5497,-120.6724,5.0,"(-49.5497, -120.6724)","((-49.4549, -121.127), (-49.5497, -120.6724))",34.478472,24824.499938
3,2025-01-28 10:40:53,-49.6517,-120.1703,6.0,"(-49.6517, -120.1703)","((-49.5497, -120.6724), (-49.6517, -120.1703))",37.920498,22752.29881
4,2025-01-28 10:40:59,-49.7691,-119.574,6.0,"(-49.7691, -119.574)","((-49.6517, -120.1703), (-49.7691, -119.574))",44.819711,26891.826756
5,2025-01-28 10:41:06,-49.8832,-118.9748,7.0,"(-49.8832, -118.9748)","((-49.7691, -119.574), (-49.8832, -118.9748))",44.815637,23048.041945
6,2025-01-28 10:41:11,-49.9686,-118.5119,5.0,"(-49.9686, -118.5119)","((-49.8832, -118.9748), (-49.9686, -118.5119))",34.470406,24818.692208
7,2025-01-28 10:41:16,-50.0602,-118.0009,5.0,"(-50.0602, -118.0009)","((-49.9686, -118.5119), (-50.0602, -118.0009))",37.906646,27292.784971
8,2025-01-28 10:41:24,-50.1812,-117.3006,8.0,"(-50.1812, -117.3006)","((-50.0602, -118.0009), (-50.1812, -117.3006))",51.708922,23269.014804
9,2025-01-28 10:41:29,-50.267,-116.7846,5.0,"(-50.267, -116.7846)","((-50.1812, -117.3006), (-50.267, -116.7846))",37.928249,27308.339109
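
A table like the one above could be produced along the following lines. This is a sketch, not the reference solution. To stay self-contained it uses a hand-written `haversine_km` function (with an assumed mean Earth radius of 6371.0088 km) instead of the `haversine` library, and it handles the first row by pairing the first position with itself, which gives a distance of 0 and an undefined speed. The name `iss` and the three-row sample are only for illustration.

```python
import math
import pandas as pd

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius

def haversine_km(start, end):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*start, *end))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

iss = pd.DataFrame({
    'timestamp': pd.to_datetime([
        '2025-01-28 10:40:37', '2025-01-28 10:40:42', '2025-01-28 10:40:47',
    ]),
    'latitude': [-49.3484, -49.4549, -49.5497],
    'longitude': [-121.6249, -121.1270, -120.6724],
})
iss['timestamp_diff'] = iss['timestamp'].diff().dt.total_seconds()
iss['position_tuple'] = list(zip(iss['latitude'], iss['longitude']))

# Pair each position with the previous one; the first row is paired with itself
previous = iss['position_tuple'].shift()
iss['pos_start_end'] = [
    (prev if isinstance(prev, tuple) else cur, cur)
    for prev, cur in zip(previous, iss['position_tuple'])
]

iss['distance'] = iss['pos_start_end'].apply(lambda pair: haversine_km(pair[0], pair[1]))

# km per second -> km per hour; the first row stays NaN because its time difference is NaN
iss['speed_kmh'] = iss['distance'] / iss['timestamp_diff'] * 3600
print(iss[['distance', 'speed_kmh']])
```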


In [91]:
!pip install haversine
import haversine as hs

Collecting haversine
  Downloading haversine-2.9.0-py2.py3-none-any.whl.metadata (5.8 kB)
Downloading haversine-2.9.0-py2.py3-none-any.whl (7.7 kB)
Installing collected packages: haversine
Successfully installed haversine-2.9.0


In [103]:
hs.haversine(df.loc[1,'pos_start_end'][0], df.loc[1,'pos_start_end'][1])


37.92452172704555

In [105]:
# `apply` passes each (start, end) pair itself to the lambda, not its row label
distances = df['pos_start_end'].apply(lambda x: hs.haversine(x[0], x[1]) if isinstance(x[0], tuple) else 0.0)

In [108]:
df1 = df['pos_start_end'].apply(lambda x: x[1])
df1


0    (-49.3484, -121.6249)
1     (-49.4549, -121.127)
2    (-49.5497, -120.6724)
3    (-49.6517, -120.1703)
4     (-49.7691, -119.574)
5    (-49.8832, -118.9748)
6    (-49.9686, -118.5119)
7    (-50.0602, -118.0009)
8    (-50.1812, -117.3006)
9     (-50.267, -116.7846)
Name: pos_start_end, dtype: object

In [109]:
df['pos_start_end']

0                     (None, (-49.3484, -121.6249))
1     ((-49.3484, -121.6249), (-49.4549, -121.127))
2     ((-49.4549, -121.127), (-49.5497, -120.6724))
3    ((-49.5497, -120.6724), (-49.6517, -120.1703))
4     ((-49.6517, -120.1703), (-49.7691, -119.574))
5     ((-49.7691, -119.574), (-49.8832, -118.9748))
6    ((-49.8832, -118.9748), (-49.9686, -118.5119))
7    ((-49.9686, -118.5119), (-50.0602, -118.0009))
8    ((-50.0602, -118.0009), (-50.1812, -117.3006))
9     ((-50.1812, -117.3006), (-50.267, -116.7846))
Name: pos_start_end, dtype: object

In [116]:
df1 = df
df1


Unnamed: 0,timestamp,latitude,longitude,timestamp_diff,position_tuple,pos_start_end
0,2025-01-28 10:40:37,-49.3484,-121.6249,NaT,"(-49.3484, -121.6249)","(None, (-49.3484, -121.6249))"
1,2025-01-28 10:40:42,-49.4549,-121.127,0 days 00:00:05,"(-49.4549, -121.127)","((-49.3484, -121.6249), (-49.4549, -121.127))"
2,2025-01-28 10:40:47,-49.5497,-120.6724,0 days 00:00:05,"(-49.5497, -120.6724)","((-49.4549, -121.127), (-49.5497, -120.6724))"
3,2025-01-28 10:40:53,-49.6517,-120.1703,0 days 00:00:06,"(-49.6517, -120.1703)","((-49.5497, -120.6724), (-49.6517, -120.1703))"
4,2025-01-28 10:40:59,-49.7691,-119.574,0 days 00:00:06,"(-49.7691, -119.574)","((-49.6517, -120.1703), (-49.7691, -119.574))"
5,2025-01-28 10:41:06,-49.8832,-118.9748,0 days 00:00:07,"(-49.8832, -118.9748)","((-49.7691, -119.574), (-49.8832, -118.9748))"
6,2025-01-28 10:41:11,-49.9686,-118.5119,0 days 00:00:05,"(-49.9686, -118.5119)","((-49.8832, -118.9748), (-49.9686, -118.5119))"
7,2025-01-28 10:41:16,-50.0602,-118.0009,0 days 00:00:05,"(-50.0602, -118.0009)","((-49.9686, -118.5119), (-50.0602, -118.0009))"
8,2025-01-28 10:41:24,-50.1812,-117.3006,0 days 00:00:08,"(-50.1812, -117.3006)","((-50.0602, -118.0009), (-50.1812, -117.3006))"
9,2025-01-28 10:41:29,-50.267,-116.7846,0 days 00:00:05,"(-50.267, -116.7846)","((-50.1812, -117.3006), (-50.267, -116.7846))"


In [122]:
# `.at` sets a single cell, so the nested tuple is stored as one value
df1.at[0, 'pos_start_end'] = ((-49.3484, -121.6249), (-49.3484, -121.6249))
#df1

### Exercise 10

Let's change APIs. Use the Kanye West API to get 10 quotes. Create a DataFrame with the quotes and the timestamp of the request.

In this API you don't get the timestamp. Build it yourself with the `pd.Timestamp.now()` function.

[{'quote': 'I am running for President of the United States',
  'timestamp': Timestamp('2025-01-28 11:49:36.254744')},
 {'quote': "Let's be like water",
  'timestamp': Timestamp('2025-01-28 11:49:41.341841')},
 {'quote': 'I am one of the most famous people on the planet',
  'timestamp': Timestamp('2025-01-28 11:49:46.416351')},
 {'quote': "People always say that you can't please everybody. I think that's a cop-out. Why not attempt it? Cause think of all the people that you will please if you try.",
  'timestamp': Timestamp('2025-01-28 11:49:51.491018')},
 {'quote': '2024', 'timestamp': Timestamp('2025-01-28 11:49:56.566302')},
 {'quote': 'Decentralize',
  'timestamp': Timestamp('2025-01-28 11:50:01.638714')},
 {'quote': 'I watch Bladerunner on repeat',
  'timestamp': Timestamp('2025-01-28 11:50:06.742456')},
 {'quote': "You can't look at a glass half full or empty if it's overflowing.",
  'timestamp': Timestamp('2025-01-28 11:50:11.800766')},
 {'quote': 'Sometimes you have to get rid o

In [130]:
response = requests.get(url="https://api.kanye.rest/")
data = response.json()
data

{'quote': 'We as a people will heal. We will insure the well being of each other'}

In [132]:
df = pd.DataFrame(columns=['quote', 'timestamp'])
df

Unnamed: 0,quote,timestamp


In [134]:
rows = []  # one dictionary per quote
for _ in range(10):
    response = requests.get(url="https://api.kanye.rest/")
    data = response.json()
    rows.append({'quote': data['quote'], 'timestamp': pd.Timestamp.now()})

# Build the DataFrame once instead of appending row by row
df = pd.DataFrame(rows, columns=['quote', 'timestamp'])
df


Unnamed: 0,quote,timestamp
0,2024,2025-02-03 22:26:52.790057
1,I love UZI. I be saying the same thing about S...,2025-02-03 22:26:52.844720
2,There's so many lonely emojis man,2025-02-03 22:26:52.881845
3,I feel calm but energized,2025-02-03 22:26:53.043116
4,Just stop lying about shit. Just stop lying.,2025-02-03 22:26:53.140001
5,Tweeting is legal and also therapeutic,2025-02-03 22:26:53.269053
6,Just stop lying about shit. Just stop lying.,2025-02-03 22:26:53.357286
7,There are people sleeping in parking lots,2025-02-03 22:26:53.420137
8,Truth is my goal. Controversy is my gym. I'll ...,2025-02-03 22:26:53.519968
9,The world is our family,2025-02-03 22:26:53.649722


### Exercise 11

Convert it into a DataFrame and use regex with `findall` to count the words in each quote. Save the count as a new column.

In [137]:
import re
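
One possible approach, sketched on three quotes from the sample output of Exercise 10: `Series.str.findall` returns the list of matches for each row, and `.str.len()` then counts them. The pattern `\w+` is an assumption about what counts as a word; it splits contractions such as "Let's" into two tokens. The name `quotes` is only illustrative.

```python
import pandas as pd

quotes = pd.DataFrame({'quote': [
    'I am running for President of the United States',
    "Let's be like water",
    'Decentralize',
]})

# findall returns a list of all non-overlapping matches per quote;
# str.len() then gives the number of matches, i.e. the word count
quotes['count_words'] = quotes['quote'].str.findall(r'\w+').str.len()
print(quotes)
```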



### Exercise 12

Create a new column that contains a boolean value that is True if the quote contains the word "I" and False otherwise.

Read about the `\b` regex pattern and use it.

Unnamed: 0,quote,timestamp,count_words,contains_I
0,I am running for President of the United States,2025-01-28 11:49:36.254744,9,True
1,Let's be like water,2025-01-28 11:49:41.341841,5,False
2,I am one of the most famous people on the planet,2025-01-28 11:49:46.416351,11,True
3,People always say that you can't please everyb...,2025-01-28 11:49:51.491018,33,True
4,2024,2025-01-28 11:49:56.566302,1,False
5,Decentralize,2025-01-28 11:50:01.638714,1,False
6,I watch Bladerunner on repeat,2025-01-28 11:50:06.742456,5,True
7,You can't look at a glass half full or empty i...,2025-01-28 11:50:11.800766,15,False
8,Sometimes you have to get rid of everything,2025-01-28 11:50:16.874773,8,False
9,I've known my mom since I was zero years old. ...,2025-01-28 11:50:21.984768,15,True
