In [171]:
# Reference 1: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/programming-with-python.html

# ^^^ General info about using boto3 and DynamoDB in Python

# Reference 2: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-query-scan.html

# ^^^ Why the 'scan' method can be slow

# Reference 3: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb/table/scan.html

# ^^^ Detailed info about the 'scan' method using the 'resource' interface

# Reference 4: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb/client/query.html

# ^^^ Detailed info about the 'query' method using the 'client' interface (the 'resource' version, Table.query, takes the
# same parameter names)

# Reference 5: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/dynamodb.html#ref-dynamodb-conditions

# ^^^ List of valid conditions for Key() and Attr()

# Reference 6: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadConsistency.html

# ^^^ Info about ConsistentRead 

In [1]:
import boto3 

# ^^^ Imports the Amazon Web Services (AWS) software development kit (SDK) for Python

# Reference 1

from boto3.dynamodb.conditions import Key, Attr

import pandas as pd

from pandasgui import show

# ^^^ Used for visualizing Pandas dataframes quickly

from IPython.display import clear_output

In [2]:
dynamodb = boto3.resource( 'dynamodb' )

# ^^^ Initializes a higher-level interface (called 'resource') that is able to access most of the tools in DynamoDB 

# There is another, lower-level interface called 'client' that exposes more of DynamoDB's operations than 'resource', but it
# is more complicated to use and has shown no tangible benefit so far, so I did not use it. If one wishes to try it,
# run: 'dynamodb = boto3.client( 'dynamodb' )'

# Reference 1

In [3]:
adspp_table = dynamodb.Table( 'ads_passenger_processed' )

# ^^^ Accesses the 'ads_passenger_processed' table from DriveOhio in the DynamoDB database. This is one of the two tables 
# available to us (those being 'ads_passenger_processed' and 'ads_passenger_processed_metadata').

In [4]:
# Before we do a query on the table 'adspp_table', let us look at a small subset of the table to get a feel for its structure

scan_output = adspp_table.scan()

# ^^^ The line above returns the first 1 MB of data in the table

# ^^^ The 'scan' method above is similar to the 'query' method we will use later, but is usually much slower when requesting data
# from a table, especially if the table is large. The reason, in short, is that the 'scan' method iterates through
# the entirety of a table, then picks out the rows you requested using the filters afterwards. The 'query' method, on the other
# hand, only iterates through the rows whose primary key or index key matches a certain value, then applies any further filters
# you specified afterwards. Depending on what primary or index key condition you choose, the 'query' method may only have
# to iterate over a small subset of the table, potentially making it a lot faster than 'scan'. For small tables, 'scan' is
# still fine to use and may be more versatile and simple, which is why I used it above.

# Reference 2

# One last comment: both the 'scan' and 'query' methods return a maximum of 1 MB of data per call. For tables larger than
# 1 MB, one has to perform multiple scans or queries, each starting where the last one left off, in order to access all of
# the table. Additionally, one needs to remember to save the data pulled out of the table after each call in another object, such
# as a Pandas dataframe, or else it will be lost.

# References 3 and 4
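
The repeated-scan idea described above can be sketched as a small helper. This is only an illustration: `scan_all_pages`, its `max_pages` cap, and the `FakeTable` stand-in are hypothetical names introduced here so the example runs without AWS access. With a real table, one would pass `adspp_table` instead of the fake.

```python
def scan_all_pages(table, max_pages=None, **scan_kwargs):
    """Collect items from successive scan pages until the table is exhausted.

    `table` is anything with a boto3-style `scan(**kwargs)` method.
    `max_pages` is a hypothetical safety cap on the number of calls.
    """
    items = []
    pages = 0
    while max_pages is None or pages < max_pages:
        page = table.scan(**scan_kwargs)
        items.extend(page['Items'])  # save this page before asking for the next
        pages += 1
        last_key = page.get('LastEvaluatedKey')
        if last_key is None:
            break  # no key in the output means the end of the table was reached
        # Tell the next scan to resume where this one stopped
        scan_kwargs['ExclusiveStartKey'] = last_key
    return items


class FakeTable:
    """Stand-in for a DynamoDB table (hypothetical, for demonstration only)."""

    def __init__(self, rows, page_size):
        self.rows = rows
        self.page_size = page_size

    def scan(self, ExclusiveStartKey=None, **_ignored):
        start = 0 if ExclusiveStartKey is None else ExclusiveStartKey['pos']
        stop = start + self.page_size
        page = {'Items': self.rows[start:stop]}
        if stop < len(self.rows):
            page['LastEvaluatedKey'] = {'pos': stop}
        return page


fake = FakeTable([{'_id': str(n)} for n in range(7)], page_size=3)
all_items = scan_all_pages(fake)
print(len(all_items))
```

Collecting the items in a plain list and building the dataframe once at the end avoids rebuilding a dataframe on every page.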

In [6]:
# The output of the 'scan' method is a dictionary containing items which are useful to understand (the output is
# extremely similar to a 'query' output)

print( scan_output.keys() )

# ^^^ This line simply prints the keys of the scan output dictionary

dict_keys(['Items', 'Count', 'ScannedCount', 'LastEvaluatedKey', 'ResponseMetadata'])


In [7]:
# The 'Items' key is the most important key, as this is where the items/rows we requested from the table are contained. These
# items/rows are stored in a list. Each item/row in this list is a dictionary, with its keys being the attributes/columns in
# that item/row and the values being the values associated with the attributes/columns in that item/row.

# Let's take a look at the attributes/columns in the first item/row of the table:

key_lens = [ len( key ) for key in scan_output[ 'Items' ][ 0 ].keys() ]
max_key_len = max( key_lens )

print('Attribute' + ' '*12 + 'Value \n')
for key, val in scan_output[ 'Items' ][ 0 ].items():

    space = max_key_len - len( key )

    print( str( key ) + ' ' * ( space + 3 ) + str( val ) )

# ^^^ The code above displays the values of the attributes/columns in the first item/row of the table. It looks a bit complicated
# just because I wanted to display it nicely. If one wants the raw dictionary, simply do print( scan_output[ 'Items' ][ 0 ] ),
# which I'll do below (it is ugly)

# References 3 and 4

Attribute            Value 

yawRate              0
drivingMode          COMPLETE_AUTO_DRIVE
topic                /apollo/canbus/chassis
msgsize              133
time                 1701972580600806943
gearLocation         GEAR_DRIVE
metadataID           e3362dcd-f0e0-11ee-ba1e-fb353e7798cd
throttlePercentage   11.6
steeringPercentage   0.74436826
brakePercentage      0
engageAdvice         {'advice': 'READY_TO_ENGAGE'}
groupMetadataID      1f70a4f0-f0e0-11ee-ba1e-fb353e7798cd
errorCode            NO_ERROR
signal               {'turnSignal': 'TURN_NONE'}
steeringRate         0
header               {'moduleName': 'canbus', 'sequenceNum': Decimal('11358'), 'timestampSec': Decimal('1701972580.6002243')}
speedMps             19.71
engineStarted        True
_id                  ed569d6e-f0e0-11ee-ba1e-fb353e7798cd
fuelRangeM           0
msg_type             
wheelSpeed           {'isWheelSpdRrValid': True, 'wheelSpdRr': Decimal('57.4'), 'isWheelSpdRlValid': True, 'wheelSpdFr': Decimal('57.

In [8]:
print( scan_output[ 'Items' ][ 0 ] )

# ^^^ What the first row actually looks like as a dictionary

{'yawRate': Decimal('0'), 'drivingMode': 'COMPLETE_AUTO_DRIVE', 'topic': '/apollo/canbus/chassis', 'msgsize': Decimal('133'), 'time': Decimal('1701972580600806943'), 'gearLocation': 'GEAR_DRIVE', 'metadataID': 'e3362dcd-f0e0-11ee-ba1e-fb353e7798cd', 'throttlePercentage': Decimal('11.6'), 'steeringPercentage': Decimal('0.74436826'), 'brakePercentage': Decimal('0'), 'engageAdvice': {'advice': 'READY_TO_ENGAGE'}, 'groupMetadataID': '1f70a4f0-f0e0-11ee-ba1e-fb353e7798cd', 'errorCode': 'NO_ERROR', 'signal': {'turnSignal': 'TURN_NONE'}, 'steeringRate': Decimal('0'), 'header': {'moduleName': 'canbus', 'sequenceNum': Decimal('11358'), 'timestampSec': Decimal('1701972580.6002243')}, 'speedMps': Decimal('19.71'), 'engineStarted': True, '_id': 'ed569d6e-f0e0-11ee-ba1e-fb353e7798cd', 'fuelRangeM': Decimal('0'), 'msg_type': '', 'wheelSpeed': {'isWheelSpdRrValid': True, 'wheelSpdRr': Decimal('57.4'), 'isWheelSpdRlValid': True, 'wheelSpdFr': Decimal('57.59'), 'isWheelSpdFrValid': True, 'wheelSpdRl': 

In [9]:
# The 'ScannedCount' and 'Count' keys contain integers representing the number of rows read before applying the filters
# specified in the 'scan' method and the number of rows returned after applying them, respectively. It's not too important
# to pay attention to these, although if the ratio of 'Count' to 'ScannedCount' is particularly small, it indicates the
# scan was inefficient (most of the rows that were read got thrown away by the filters).

print( 'Count: ', scan_output[ 'Count' ] )
print( 'ScannedCount: ', scan_output[ 'ScannedCount' ] )

# ^^^ One can see the 'ScannedCount' and 'Count' numbers are the same because we did not set any filters in our scan

# References 3 and 4

Count:  1324
ScannedCount:  1324
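
As a rough illustration of the ratio mentioned above, the helper below divides 'Count' by 'ScannedCount'. The name `filter_efficiency` is made up for this sketch. The first pair of numbers is the scan output above, and the second is the 2-out-of-1584 result of the query run later in this notebook.

```python
def filter_efficiency(response):
    """Return Count / ScannedCount for a scan or query output (1.0 is ideal)."""
    scanned = response['ScannedCount']
    if scanned == 0:
        return 1.0  # nothing was read, so nothing was wasted
    return response['Count'] / scanned


# Unfiltered scan: every row read was also returned
print(filter_efficiency({'Count': 1324, 'ScannedCount': 1324}))

# Filtered query: most of the rows read were discarded by the filter
print(filter_efficiency({'Count': 2, 'ScannedCount': 1584}))
```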


In [10]:
# The 'LastEvaluatedKey' key is an important one. Because the 'scan' and 'query' methods only return a maximum of 1 MB of
# data from a table, if a table has more than 1 MB of data, we need a way to do another 'scan' or 'query' starting at the
# item/row where the previous one stopped. The 'LastEvaluatedKey' key contains the primary key of the item/row at which a 'scan'
# or 'query' method stopped, and if passed as the 'ExclusiveStartKey' argument of another 'scan' or 'query' method, will start
# that 'scan' or 'query' just after that item/row.

# If the 'LastEvaluatedKey' key is absent from a 'scan' or 'query' output, that means the 'scan' or 'query' has reached the
# end of the table.

print( scan_output[ 'LastEvaluatedKey' ] )

# ^^^ By looking at the 'LastEvaluatedKey', we can see that the primary key of the 'ads_passenger_processed' table is a
# combination of the '_id' and 'time' attributes/columns. In DynamoDB, '_id' is what would be referred to as the 'partition key' and
# 'time' as the 'sort key'.

# References 3 and 4

{'_id': '628a6707-f1a1-11ee-bac2-fb353e7798cd', 'time': Decimal('1706040720012172773')}
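
The 'LastEvaluatedKey' shows which attributes make up the primary key, but not which one is the partition key and which is the sort key. The boto3 Table resource exposes a `key_schema` attribute that states this explicitly. The sketch below reads such a schema. The helper name and the example schema are assumptions made for illustration (the example mirrors what the comments above expect for this table, it was not read from the table itself).

```python
def describe_key_schema(key_schema):
    """Split a DynamoDB key schema into (partition_key, sort_key) names."""
    partition_key = None
    sort_key = None
    for element in key_schema:
        if element['KeyType'] == 'HASH':     # HASH marks the partition key
            partition_key = element['AttributeName']
        elif element['KeyType'] == 'RANGE':  # RANGE marks the sort key
            sort_key = element['AttributeName']
    return partition_key, sort_key


# Assumed shape of adspp_table.key_schema (not read from the real table)
example_schema = [
    {'AttributeName': '_id', 'KeyType': 'HASH'},
    {'AttributeName': 'time', 'KeyType': 'RANGE'},
]
print(describe_key_schema(example_schema))
```

With access to the table, the call would be `describe_key_schema( adspp_table.key_schema )`.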


In [11]:
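# The 'ResponseMetadata' key holds information about the HTTP request itself rather than the table: the request ID, the
# HTTP status code, the response headers, and the number of retry attempts. It is mainly useful for debugging. For our
# purposes, it does not seem important.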

print( scan_output[ 'ResponseMetadata' ] )

{'RequestId': 'BA3A1OOPV54MOK5G6EOOE3AKDRVV4KQNSO5AEMVJF66Q9ASUAAJG', 'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'Server', 'date': 'Wed, 11 Sep 2024 19:03:35 GMT', 'content-type': 'application/x-amz-json-1.0', 'content-length': '1831761', 'connection': 'keep-alive', 'x-amzn-requestid': 'BA3A1OOPV54MOK5G6EOOE3AKDRVV4KQNSO5AEMVJF66Q9ASUAAJG', 'x-amz-crc32': '399128536'}, 'RetryAttempts': 0}
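
If one ever does need the 'ResponseMetadata', a small helper like the one below could pull out the fields that are most useful when debugging. The helper name and the choice of fields are assumptions for this sketch. The sample dictionary is an abbreviated copy of the output printed above.

```python
def summarize_response_metadata(metadata):
    """Pull out the fields most useful for debugging a request."""
    return {
        'ok': metadata['HTTPStatusCode'] == 200,
        'request_id': metadata['RequestId'],
        'retries': metadata['RetryAttempts'],
        # Size of the HTTP response body in bytes, as reported by the server
        'payload_bytes': int(metadata['HTTPHeaders']['content-length']),
    }


sample_metadata = {
    'RequestId': 'BA3A1OOPV54MOK5G6EOOE3AKDRVV4KQNSO5AEMVJF66Q9ASUAAJG',
    'HTTPStatusCode': 200,
    'HTTPHeaders': {'content-length': '1831761'},
    'RetryAttempts': 0,
}
summary = summarize_response_metadata(sample_metadata)
print(summary)
```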


In [12]:
# Now that we have looked at what the output of a 'scan' and 'query' contains, let's take a look at the table beyond just the first
# row, just as another step to get a feel for its structure. We'll do this by using Pandas

scan_output_df = pd.json_normalize( scan_output[ 'Items' ] )

# ^^^ This line converts the outputted items/rows in the 'Items' key into a Pandas dataframe
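
As the printed row earlier shows, boto3 returns every number as a Decimal object, and those stay as generic Python objects inside a dataframe. One possible approach, sketched below under that assumption, is to convert them before building the dataframe. The helper name `decimals_to_numbers` is made up here. Whole numbers become int so that the long 'time' values keep every digit, which a float could not hold.

```python
from decimal import Decimal


def decimals_to_numbers(value):
    """Recursively replace Decimal values with int or float."""
    if isinstance(value, Decimal):
        # Whole numbers become int so large timestamps keep every digit
        if value == value.to_integral_value():
            return int(value)
        return float(value)
    if isinstance(value, dict):
        return {key: decimals_to_numbers(val) for key, val in value.items()}
    if isinstance(value, list):
        return [decimals_to_numbers(val) for val in value]
    return value  # strings, booleans, etc. are left untouched


# A few fields copied from the first row printed earlier
item = {
    'time': Decimal('1701972580600806943'),
    'speedMps': Decimal('19.71'),
    'header': {'sequenceNum': Decimal('11358')},
    'drivingMode': 'COMPLETE_AUTO_DRIVE',
}
converted = decimals_to_numbers(item)
print(converted)
```

The converted items could then be passed on as `pd.json_normalize( [ decimals_to_numbers( item ) for item in scan_output[ 'Items' ] ] )`.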

In [13]:
# Now, let's just show it

show( scan_output_df )

# What we can notice is that there are a lot of attributes/columns, and a lot of items/rows have missing data for these
# attributes/columns. The value of the 'topic' attribute/column seems to determine which attributes/columns
# are filled, so, although '_id' and 'time' make up the primary key, 'topic' might be a good attribute/column to index over
# to extract a certain group of attributes/columns.

PandasGUI INFO — pandasgui.gui — Opening PandasGUI

Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`



<pandasgui.gui.PandasGui at 0x7c8731e1a200>

In [14]:
# Suppose we want to find the values for the 'time' and 'throttlePercentage' attributes in rows containing the value
# 1f70a4f0-f0e0-11ee-ba1e-fb353e7798cd for the 'groupMetadataID' attribute. We can do this using the 'query' method.
# Notice that the 'throttlePercentage' attribute only has values for rows with the value /apollo/canbus/chassis for
# the 'topic' attribute, so we should index over where 'topic' is equal to /apollo/canbus/chassis to be efficient.
# By choosing a good attribute to index over, we can perform this task without having to iterate over the entire
# table, which would take a very long time.

# In order to do this query, we need to define the parameters that will go into it. The query below contains all the
# non-legacy parameters that the 'query' method in DynamoDB can take. Some are very important, some are not necessary.
# I will describe each one, but in case you run into an error in the future or want a thorough explanation, I recommend
# looking at Reference 4, as it contains a lot of useful info.

# Reference 4

query_input = dict(
                   IndexName = 'topic-index', # !!! If one wants to index over an attribute in a table that is NOT the primary key,
                                              # one needs to give the name of a secondary index on that attribute here. The
                                              # index must already exist on the table. Typically, such an index is named as
                                              # the attribute one wants to index over + '-index'. If omitted, the query will
                                              # index over the primary key. (a string)

                                              # Notice, because we want to index over 'topic' for our current query, it is set to
                                              # 'topic-index'.

                   # Select = # Much of what this parameter does is done implicitly and is not necessary. If one does encounter a
                              # scenario in which it is needed, please refer to the documentation in Reference 4. (a string)
                              # (Essentially this parameter specifies whether all, some, or no attributes are to be outputted)
    
                   Limit = 2000, # This parameter limits the number of items/rows that can be iterated over to less than or
                                 # equal to what it is set to. Two scenarios in which it will not iterate over the maximum number
                                 # of rows specified here are as follows: the table does not have that many rows left or the 1 MB limit
                                 # for the query has been reached. This parameter is not necessary. (an integer)
    
                   # ConsistentRead = # A parameter that determines whether the data in the rows being iterated over and pulled out
                                      # is fully up to date with the latest insertions, updates, or deletions. Not necessary to specify
                                      # unless the table is being modified frequently. (a boolean)

                                      # Reference 6
    
                   # ScanIndexForward = # A parameter that determines whether the table is traversed in ascending or 
                                        # descending order in the context of the sort key. Default is true. Likely not 
                                        # necessary to specify. (a boolean)
    
                   # ExclusiveStartKey = # !!! A parameter to specify what item/row the query should start at. Specifically, this is the 
                                         # primary key of the first item/row the query will start at. This parameter is extremely 
                                         # important when one wants to query a table larger than 1 MB, as a single query will not iterate
                                         # over all of it. The 'LastEvaluatedKey' of the output of a previous query is usually put here
                                         # to continue where it left off. (a dict)

                                         # If omitted, the query will start at the top or bottom row in the context of the sort key.
    
                   # ReturnConsumedCapacity = # A parameter which adds a metric indicating the read capacity consumed by the query to the
                                              # output of the query. Not necessary. (a string)

                   ProjectionExpression = '#time, throttlePercentage',

                   # !!! ^^^ A parameter that determines what attributes will be retrieved by the query. If one wants all
                   # available attributes in each item/row, this parameter should be omitted. Otherwise, one should write
                   # the desired attributes in a comma-separated list as a string. Ex. 'attr1, attr2, attr3, attr4'. (a string)

                   # Notice, because we want to find the values of the 'time' and 'throttlePercentage' attributes for our current
                   # query, these attributes have been listed. However, you might notice that the 'time' attribute has a # attached
                   # to its front. This is because the word 'time' is actually a reserved word in DynamoDB. If an attribute is
                   # a reserved word or has special characters that cause it to be misinterpreted, it needs to be assigned an alias
                   # to be used in expressions using the ExpressionAttributeNames key further below.
    
                   FilterExpression = Attr( 'groupMetadataID' ).eq( '1f70a4f0-f0e0-11ee-ba1e-fb353e7798cd' ),

                   # !!! ^^^ A parameter containing conditions attributes in each item/row must satisfy in order for that item/row to
                   # be in the query output. The conditions are written as Attr() taking the name of an attribute one would like to
                   # filter, then after the parenthesis, a period and the type of condition the attribute must satisfy. If one does
                   # not want to add any extra conditions aside from the KeyConditionExpression, this parameter may be omitted.

                   # ^^^ Correction: the attributes in the FilterExpression must not include the attribute being indexed over (in 
                   # this case, 'topic' should not be an attribute in FilterExpression)

                   # General Form: Attr( attribute_name ).condition()

                   # There are many conditions one may apply to an Attr( attribute_name ), such as .eq( value ) [checking if an 
                   # attribute is equal to a certain value], .ne( value ) [not equal to a value], .between( low_value, high_value )
                   # [between two values], etc. A full list of conditions that can act on Attr() can be found in Reference 5.

                   # In the FilterExpression parameter, multiple conditions may be chained together using the & symbol.
                   # Ex. Attr( attribute_name1 ).condition1() & Attr( attribute_name2 ).condition2()

                   # For the current query, the FilterExpression checks if the 'groupMetadataID' attribute is equal to
                   # '1f70a4f0-f0e0-11ee-ba1e-fb353e7798cd'.

                   # Reference 5

                   KeyConditionExpression = Key( 'topic' ).eq( '/apollo/canbus/chassis' ),

                   # !!! ^^^ A parameter containing conditions the attribute being indexed over in each item/row must satisfy in
                   # order for that item/row to be iterated over and potentially sent to the output. The reason that the rows
                   # iterated over are only potentially sent to the output is that they must also pass the filtering in the
                   # FilterExpression before being sent to the output. Essentially, the KeyConditionExpression parameter picks
                   # the rows out of the table satisfying the conditions for the attribute being indexed over, then, only after
                   # collecting all of them, sends them to the FilterExpression, which then sends the filtered rows to the output.

                   # ^^^ If IndexName is omitted, the attribute(s) being indexed over is the primary key. The primary key
                   # may consist of a partition key, or a partition key and a sort key. In this case, the KeyConditionExpression
                   # must consist of a condition for the partition key or the partition key and the sort key, but not the sort
                   # key alone.

                   # The KeyConditionExpression parameter is a required part of a query. It must consist of a condition for 
                   # the partition key, the partition key and the sort key, or the attribute being indexed over.

                   # General Form: Key( indexed_attribute_name ).condition()

                   # There are many conditions one may apply to a Key( indexed_attribute_name ), but there are fewer than are
                   # available for Attr( attribute_name ). A full list of conditions that can act on Key() can be found in
                   # Reference 5.

                   # For the current query, because we want to index over where 'topic' is equal to /apollo/canbus/chassis, the
                   # KeyConditionExpression checks exactly that.

                   # Reference 5

                   #-----------------------------------------------

                   ExpressionAttributeNames = {
                                               '#time' : 'time'
                                              },   
                   
                   # !!! ^^^ As mentioned in the description for ProjectionExpression, this parameter is where aliases for 
                   # attributes that are reserved words in DynamoDB or have special characters that cause them to be 
                   # misinterpreted need to be defined in order to be used in expressions. (a dict)

                   # This is done by creating a dictionary where the keys are the aliases to be used in expressions (the
                   # aliases need to start with a #) and the values are the original attributes that have issues.

                   # (It can also be used to assign aliases to attributes that have no issues, but are just cumbersome to
                   # write and that you want to use multiple times in expressions)

                   #------------------------------------------------

                   # ExpressionAttributeValues = 
    
                   # !!! ^^^ This parameter serves a very similar purpose to ExpressionAttributeNames,
                   # although instead of being used to assign aliases to attributes, it is used
                   # to assign aliases to values, such as those in the table, that you want to
                   # use/check for in expressions. The aliases need to start with a : (a dict)

                   # It is only needed when expressions are written as plain strings. When the
                   # conditions are built with Key() and Attr(), as in this query, boto3 fills in
                   # the value aliases automatically, so the parameter can be left out.
                  
                  )
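
For reference, here is a sketch of how the same query parameters might look with the expressions written as plain strings, which is the situation where ExpressionAttributeValues is required. It assumes that the 'query' method accepts string expressions in the same way the lower-level API does, and it has not been run against this table. The function name `build_string_query` is made up for illustration. Every attribute is given a # alias so that reserved words cannot cause trouble.

```python
def build_string_query(topic, group_id):
    """Build query parameters with expressions written as plain strings."""
    return dict(
        IndexName='topic-index',
        KeyConditionExpression='#topic = :topic',
        FilterExpression='#gid = :gid',
        ProjectionExpression='#time, throttlePercentage',
        # Aliases for attribute names start with '#'
        ExpressionAttributeNames={
            '#topic': 'topic',
            '#gid': 'groupMetadataID',
            '#time': 'time',
        },
        # Aliases for values start with ':'
        ExpressionAttributeValues={
            ':topic': topic,
            ':gid': group_id,
        },
    )


string_query_input = build_string_query(
    '/apollo/canbus/chassis',
    '1f70a4f0-f0e0-11ee-ba1e-fb353e7798cd',
)
print(string_query_input['ExpressionAttributeValues'])
```

The resulting dictionary would be passed on as `adspp_table.query( **string_query_input )`.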

In [15]:
# Because the previous cell has a lot of comments, let us summarize the goal of our query and write the query without the comments.

# We want to find the values for the 'time' and 'throttlePercentage' attributes for rows containing the 'groupMetadataID'
# attribute value 1f70a4f0-f0e0-11ee-ba1e-fb353e7798cd by indexing over where the 'topic' attribute value is equal to 
# /apollo/canbus/chassis.

query_input = dict(
                   IndexName = 'topic-index',
    
                   Limit = 2000,

                   ProjectionExpression = '#time, throttlePercentage',
    
                   FilterExpression = Attr( 'groupMetadataID' ).eq( '1f70a4f0-f0e0-11ee-ba1e-fb353e7798cd' ),

                   KeyConditionExpression = Key( 'topic' ).eq( '/apollo/canbus/chassis' ),

                   ExpressionAttributeNames = {
                                               '#time' : 'time'
                                              },
                  )
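
The IndexName used above only works if an index with that name exists on the table. The boto3 Table resource exposes a `global_secondary_indexes` attribute listing them (there is a matching `local_secondary_indexes` attribute as well). The sketch below extracts the names. The helper name and the example list are assumptions for illustration, not something read from the real table.

```python
def list_index_names(secondary_indexes):
    """Return the names of the indexes in a boto3-style index description list."""
    # The attribute may be None when a table has no indexes of that kind
    return [index['IndexName'] for index in (secondary_indexes or [])]


# Assumed shape of adspp_table.global_secondary_indexes (not read from the real table)
example_indexes = [
    {
        'IndexName': 'topic-index',
        'KeySchema': [{'AttributeName': 'topic', 'KeyType': 'HASH'}],
    },
]
print(list_index_names(example_indexes))
```

With access to the table, the call would be `list_index_names( adspp_table.global_secondary_indexes )`.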

In [16]:
# Putting this dictionary of query parameters into a 'query' method for the table 'adspp_table'

query_output = adspp_table.query( **query_input )

In [17]:
# We receive 2 items/rows out of 1584 items/rows queried that have '/apollo/canbus/chassis' as a topic
# and have their 'groupMetadataID' equal to '1f70a4f0-f0e0-11ee-ba1e-fb353e7798cd'

# It is worth noting that the 'LastEvaluatedKey' key is present in the output, indicating that the query has not
# iterated through the whole 'adspp_table'. We must keep querying the table with the same query parameters,
# except with the 'ExclusiveStartKey' equal to the 'LastEvaluatedKey' key, until the 'LastEvaluatedKey' eventually
# disappears from the output. Only then will our query be truly finished.

print( query_output[ 'Items' ], '\n' )

print( query_output[ 'ScannedCount' ], '\n' )

print( query_output.keys() )

[{'throttlePercentage': Decimal('8.7'), 'time': Decimal('1701972622544006277')}, {'throttlePercentage': Decimal('26.6'), 'time': Decimal('1701972567168325460')}] 

1584 

dict_keys(['Items', 'Count', 'ScannedCount', 'LastEvaluatedKey', 'ResponseMetadata'])
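
Notice that only 1584 rows were iterated over even though Limit was set to 2000. Given the two scenarios described for the Limit parameter, and since 'LastEvaluatedKey' is present, this suggests the query stopped at the 1 MB cap. The sketch below spells out that reasoning. The function name is made up, the verdict is an inference rather than something DynamoDB reports directly, and the 'LastEvaluatedKey' value in the example is a placeholder.

```python
def why_page_ended(response, limit=None):
    """Guess why a single scan or query call stopped where it did."""
    if 'LastEvaluatedKey' not in response:
        return 'end of data'
    if limit is not None and response['ScannedCount'] >= limit:
        return 'Limit reached'
    # More data remains and Limit was not hit, so the 1 MB cap is the likely cause
    return 'size cap reached'


# Numbers from the query output above; the key itself is a placeholder
example_output = {'ScannedCount': 1584, 'LastEvaluatedKey': {'placeholder': True}}
print(why_page_ended(example_output, limit=2000))
```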


In [18]:
# But first, let us save our results so far in a Pandas dataframe

query_output_df = pd.json_normalize( query_output[ 'Items' ] )

In [19]:
# Let's take a quick look at it

show( query_output_df )

PandasGUI INFO — pandasgui.gui — Opening PandasGUI

Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`

<pandasgui.gui.PandasGui at 0x7c8701ab3910>

In [20]:
# Okay, now let's query the rest of the table, appending the results of our queries to the Pandas dataframe
# we just made as we loop through the table. (Actually, we won't loop through the whole table, as it would take
# a long time; let's just do it 50 times for demonstration purposes)

i = 0 # Initializes a counting variable to check if the while loop has looped 50 times
while ( 'LastEvaluatedKey' in query_output.keys() ) and ( i < 50 ):

    # ^^^ A while loop that will continue looping as long as the 'LastEvaluatedKey' key still exists in
    # the query outputs and the counting variable is less than 50

    query_input[ 'ExclusiveStartKey' ] = query_output[ 'LastEvaluatedKey' ]

    # ^^^ Adds/updates a key in the query input parameters we defined earlier to tell a new query to
    # start at the item the previous query stopped at.

    query_output = adspp_table.query( **query_input )

    # ^^^ Runs the new query

    temp_df = pd.json_normalize( query_output[ 'Items' ] )

    # ^^^ Saves the new query results in a temporary dataframe

    query_output_df = pd.concat( [ query_output_df, temp_df ], ignore_index = True )

    # ^^^ Appends the temporary dataframe to the 'query_output_df' defined earlier

    i = i + 1

    # ^^^ Updates the counting variable

In [21]:
# Let's take a quick look at the dataframe again to see how many items we gained

show( query_output_df )

PandasGUI INFO — pandasgui.gui — Opening PandasGUI

Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`

<pandasgui.gui.PandasGui at 0x7c8701ab3a30>

In [208]:
# Quite a few!

# If you wish to save the dataframe, you can use the .to_csv() method from Pandas