<a href="https://colab.research.google.com/github/aamanirp/Replica/blob/main/BigQuery_bquxjob_4abc3503_189757f7505.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [3]:
# @title Setup
from google.colab import auth
from google.cloud import bigquery
from google.colab import data_table

# Authenticate first so the BigQuery client is created with the user's credentials
auth.authenticate_user()

project = 'replica-customer' # Project ID inserted based on the query results selected to explore
location = 'US' # Location inserted based on the query results selected to explore
client = bigquery.Client(project=project, location=location)

# Render DataFrames as interactive tables in Colab
data_table.enable_dataframe_formatter()

## Reference SQL syntax from the original job
Use the ```jobs.query```
[method](https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query) to
return the SQL text of the original job. You can copy it from the output cell
below to edit the query now or later. Alternatively, follow
[this link](https://console.cloud.google.com/bigquery?j=replica-customer:US:bquxjob_4abc3503_189757f7505)
back to BigQuery to edit the query in the BigQuery user interface.

In [4]:
# Running this code will display the query used to generate your previous job

job = client.get_job('bquxjob_4abc3503_189757f7505') # Job ID inserted based on the query results selected to explore
print(job.query)

SELECT destination_bgrp_lat, destination_bgrp_lng
FROM `replica-customer.south_central.south_central_2021_Q4_thursday_trip` 
WHERE mode ="COMMERCIAL" AND distance_miles>15


## Result set loaded from BigQuery job as a DataFrame
Query results are referenced by the Job ID of the job run in BigQuery, so the
query does not need to be re-run to explore the results. The ```to_dataframe```
[method](https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.job.QueryJob.html#google.cloud.bigquery.job.QueryJob.to_dataframe)
downloads the results to a pandas DataFrame by using the BigQuery Storage API.

To edit the query, use the BigQuery SQL editor or the
```Optional:``` sections below.
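
As one way to edit the query from Python rather than the SQL editor, the sketch below rewrites the job's SQL with named query parameters so the filter values can be changed without touching the SQL text. The helper names `SQL_TEMPLATE`, `build_params`, and `run_edited_query` are hypothetical, and treating `distance_miles` as `FLOAT64` is an assumption about the table schema.

```python
# Hypothetical helpers; the table and filter values come from the query printed above.
SQL_TEMPLATE = """
SELECT destination_bgrp_lat, destination_bgrp_lng
FROM `replica-customer.south_central.south_central_2021_Q4_thursday_trip`
WHERE mode = @mode AND distance_miles > @min_miles
"""


def build_params(mode="COMMERCIAL", min_miles=15.0):
    """Return (name, BigQuery type, value) tuples for the named parameters."""
    # FLOAT64 for distance_miles is an assumption about the column type.
    return [("mode", "STRING", mode), ("min_miles", "FLOAT64", float(min_miles))]


def run_edited_query(client, mode="COMMERCIAL", min_miles=15.0):
    """Run the parameterized query and return the results as a DataFrame."""
    # Imported here so the helpers above can be inspected without the library installed.
    from google.cloud import bigquery

    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter(name, type_, value)
            for name, type_, value in build_params(mode, min_miles)
        ]
    )
    return client.query(SQL_TEMPLATE, job_config=job_config).to_dataframe()
```

In this notebook, `run_edited_query(client, min_miles=25)` would start a new query job, unlike the cells here that reuse the stored results of the original job.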

In [5]:
# Running this code will read results from your previous job

job = client.get_job('bquxjob_4abc3503_189757f7505') # Job ID inserted based on the query results selected to explore
results = job.to_dataframe()
results



| | destination_bgrp_lat | destination_bgrp_lng |
|---|---|---|
| 0 | 31.338349 | -87.256995 |
| 1 | 31.877050 | -90.383336 |
| 2 | 35.829926 | -87.497479 |
| 3 | 34.526500 | -87.675578 |
| 4 | 33.653783 | -85.120298 |
| ... | ... | ... |
| 1678735 | 33.159151 | -84.888487 |
| 1678736 | 33.498543 | -86.822869 |
| 1678737 | 32.981412 | -94.278457 |
| 1678738 | 36.384685 | -82.380565 |


## Show descriptive statistics using describe()
Use the ```pandas DataFrame.describe()```
[method](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.describe.html)
to generate descriptive statistics. Descriptive statistics include those that
summarize the central tendency, dispersion and shape of a dataset’s
distribution, excluding ```NaN``` values. You may also use other Python methods
to interact with your data.
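
As an illustration of the "other Python methods" mentioned above, the sketch below reproduces the same kind of summary with only the standard library. The function name `describe_column` is hypothetical, the input is just the latitudes of the first five preview rows rather than the full result set, and the claim that `method="inclusive"` matches the default percentile interpolation in pandas is an assumption worth checking against your pandas version. Unlike `describe()`, this sketch does not filter out ```NaN``` values.

```python
import statistics


def describe_column(values):
    """Summarize a numeric column in the same layout as DataFrame.describe()."""
    data = sorted(values)
    # 'inclusive' interpolates linearly between neighboring data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "count": len(data),
        "mean": statistics.fmean(data),
        "std": statistics.stdev(data),  # sample standard deviation (divides by n - 1)
        "min": data[0],
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": data[-1],
    }


# Latitudes from the first five preview rows only, not the full result set
preview_lats = [31.338349, 31.877050, 35.829926, 34.526500, 33.653783]
for name, value in describe_column(preview_lats).items():
    print(f"{name:>5}: {value}")
```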

In [6]:
results.describe()

| | destination_bgrp_lat | destination_bgrp_lng |
|---|---|---|
| count | 1678740.0 | 1678740.0 |
| mean | 33.55441 | -88.87798 |
| std | 2.089961 | 2.971905 |
| min | 29.13546 | -95.11684 |
| 25% | 31.86445 | -91.13352 |
| 50% | 33.75478 | -88.781 |
| 75% | 35.3633 | -86.58591 |
| max | 37.05418 | -81.24693 |


### Map Marker Clusters


In [None]:
import folium
from folium.plugins import MarkerCluster

# Center the map on the first destination in the result set
map_center = [results['destination_bgrp_lat'].iloc[0], results['destination_bgrp_lng'].iloc[0]]
map_cluster = folium.Map(location=map_center, zoom_start=5)

# Create the marker cluster group and attach it to the map
marker_cluster = MarkerCluster().add_to(map_cluster)

# Add one marker per row to the cluster group
for lat, lng in zip(results['destination_bgrp_lat'], results['destination_bgrp_lng']):
    pop_text = f"Lat:{lat}<br>Lon:{lng}"
    folium.Marker([lat, lng], popup=pop_text).add_to(marker_cluster)

# Write the interactive map to an HTML file
map_cluster.save("cluster_map.html")
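
The loop above adds one marker per row, and the `describe()` count shows about 1.68 million rows, so building and saving the map may be slow and the HTML file very large. One possible mitigation, sketched below, is to count rows per unique coordinate pair first and plot one marker per pair. This assumes, as the `bgrp` column names suggest but the notebook does not confirm, that many trips share the same destination coordinates. The function name `aggregate_points` and the duplicated sample row are hypothetical.

```python
from collections import Counter


def aggregate_points(points, precision=6):
    """Count how many rows share each (lat, lng) pair after rounding."""
    counts = Counter(
        (round(lat, precision), round(lng, precision)) for lat, lng in points
    )
    # Most frequent destinations first
    return counts.most_common()


# Two preview rows from the result set, plus one hypothetical repeat for illustration
sample = [
    (31.338349, -87.256995),
    (31.877050, -90.383336),
    (31.338349, -87.256995),  # hypothetical duplicate
]
for (lat, lng), trips in aggregate_points(sample):
    print(f"{lat:.6f}, {lng:.6f}: {trips} trips")
```

In the notebook, the input could be `zip(results['destination_bgrp_lat'], results['destination_bgrp_lng'])`, with one `folium.Marker` added per unique pair and the trip count shown in its popup.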
