New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spatial and temporal distribution of service calls using big data tools #497
Conversation
Check out this pull request on ReviewNB: https://app.reviewnb.com/Esri/arcgis-python-api/pull/497 You'll be able to see notebook diffs and discuss changes. Powered by ReviewNB. |
@priyankatuteja are you waiting to get BDFS configured on PythonAPI playground portal for this PR? |
@AtmaMani I am waiting to get the sample reviewed and for that, I assume you need data to test the notebook. Please let me know where to add BDFS for this sample so this could be reviewed and approved accordingly |
1 similar comment
@AtmaMani I am waiting to get the sample reviewed and for that, I assume you need data to test the notebook. Please let me know where to add BDFS for this sample so this could be reviewed and approved accordingly |
@priyankatuteja yes please follow up on the email thread you started to get the data published and update the connection strings accordingly. Meanwhile I will review the content |
@jyaistMap could you do the first round of reviews for this PR? |
Hi @priyankatuteja I'm reviewing the sample, but do not have access to |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:19Z I've made a ton of suggestions about adding some explanatory text. I understand this is a sample rather than a guide. It might help clarify at the beginning of this sample what we expect the audience to already know.
I think links to relevant help at the beginning of the sample could benefit users: priyankatuteja commented on 2019-11-09T12:34:29Z will it be a good idea to have a further reading section wherein I could include hyperlinks like you suggested above. The idea is to keep consistency with other samples. This allows a user/reader to skim and scan the sample in order to understand the main idea. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:20Z Given the wording is "sample aims to find answers to some fundamental questions:", I'd reword the bullet points to the form of questions:
I suggest telling people how to find the specific data at data.gov. I typed a search for |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:21Z
|
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:22Z Change the first word to an active verb to stay consistent: Ensure... |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:22Z Might help to clarify that you've set up a directory and file structure described above at a specific location. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:23Z I suggest changing the sentence and linking to the manifest documenation:
The Geoanalytics server samples the datasets in the big data file share to generate a |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:24Z Maybe clarify a bit about the spatial reference: but we know this data is from New Orleans, Louisiana, and is actually stored in the Louisiana State Plane Coordinate System. We need to edit the manifest with the correct spatial reference: {"wkid": 102682, "latestWkid": 3452} |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:25Z Use an active verb to stay consistent: Search... |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:25Z I suggest linking to the documentation to clarify terms as they are presented in a sample:
Adding a big data file share to the Geoanalytics server adds a corresponding big data file share item on the portal. We can search for these types of items using the |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:26Z I would add a sentence describing what you are doing here. I'd also clarify the type of Python API object this returns. It looks like it would return a Feature Layer, but the
Querying the |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:27Z Stay consistent: Search... |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:28Z I'd delete this sentence or remove the bold. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:30Z I'd suggest hyperlinks for the different objects:
The
Optionally, the tool can output a |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:30Z The
Consequently, I received errors on the
I don't know what
I found the output
priyankatuteja commented on 2019-11-10T02:33:46Z I need to test this tool to confirm whether the output type has been changed for some reason. priyankatuteja commented on 2019-11-11T10:45:09Z This solves problem with return_tuple=True |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:31Z I suggest: We can see some records have missing or invalid attribute values, including in the fields the manifests defines as the time and geometry values. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:32Z typo: New Orlans....should be New Orleans beacause....should be because
Change second sentence: We want to explore data points within New Orleans city limits.
Change third sentence and include a hyperlink: We will use the
Remove the last sentence, it's redundant. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:32Z I'd move this cell above the paragraph above so the user can see what you're describing. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:33Z I'd rephrase: Extract features within New Orleans |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:34Z Overlay seems like the wrong word. The |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:34Z Do we know why there is no thumbnail for the priyankatuteja commented on 2019-11-10T02:54:29Z Need to find out the reason. priyankatuteja commented on 2019-11-12T06:10:25Z Looks like it has been there for all tools, see https://developers.arcgis.com/python/sample-notebooks/analyzing-violent-crime/ |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:35Z
|
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:42Z Do we know why the thumbnail is missing? |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:43Z I would add a sentence or two explaining what the data says: The darkest red features indicate areas where you can state with 99 percent confidence that the clustering of 911 call features is not the result of random chance but rather of some other variable that might be worth investigating. Similarly, the darkest blue features indicate that the lack of 911 calls is most likely not just random, but with 90% certainty you can state it is because of some variable in those locations. Features that are beige do not represent statistically significant clustering; the number of 911 calls could very likely be the result of random processes and random chance in those areas.
It might also help to add the Block Group layer over top of the hot spots and symbolize with no color but just a thicker outline that would allow users to zoom in and investigate which block groups have clustering and which don't. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:44Z I would change the heading to use an active very and simplify to remain consistent with other headings Visualize other aspects |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:45Z Some suggested edits: The Geoanalytics Server installs a Python 3.6 environment that this tool uses. The environment includes When using the |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:45Z Reorder the sentence The
|
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:46Z
The I'm assuming I'm wrong, but perhaps explaining what's happening in the function and how it relates to the
|
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:46Z I suggest being explicit in indicating that the
I also suggest explaining how to find the output on the portal by reading the message output of the tool |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:47Z I would rephrase the above sentence: Now we'll investigate 911 calls to investigate the most common reasons for calls within block addresses. We will define a new function to input to the |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:48Z I would add explanatory text for the lines within the function. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:49Z Rephrase: The result indicates the most common reason for a 911 call in the 22A Block in New Orleans is defined as
It might also help to tell people they can get more specific information about this data and what these attributes mean by looking for the data themselves and exploring the metatada: Go to data.gov and search for
|
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:49Z I would be more explicit in explaining what's coming up: Now let's investigate the 911 call data for temporal trends. We saw in the manifest that the |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:50Z Let's define a helper function to convert the TimeCreate attribute field string types into a date type. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:51Z
priyankatuteja commented on 2019-11-11T05:38:14Z since the run_python_script tool is executed directly on GA server site. The imports made in notebook do not work for code within script priyankatuteja commented on 2019-11-11T05:44:24Z The data is too big to use datetime.strptime('TimeCreate', %m/%d/%Y %H:%M:%S). We need distributed platform to perform this. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:51Z typo: eralier should be earlier |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). jyaistMap commented on 2019-10-04T16:57:52Z I believe the functions written as input to the
I found myself sidetracked and looking up documentation for how the functions were supposed to work. I also have NO experience with data science workflows and distributed analysis with pandas or pyspark, so if we conclude that the audience for this sample will already know this type of information, you can ignore many of my comments :) priyankatuteja commented on 2019-11-11T15:29:58Z I have been trying to add some hyperlinks so our users can quickly look up for more details. It really doesn't seem like you have NO experience with such workflows. These suggestions are really meaningful.Thanks a lot! |
@AtmaMani @priyankatuteja |
@jyaistMap Thanks for the suggestions. I was occupied with python api testing and documentation. Apologies for the delayed response. I will be incorporating these suggestions soon. Also placing the data on \JYAIST2\sample_review for further testing of the sample. |
@jyaistMap I do not have access to \JYAIST2\sample_review. But, I have added you and Atma to have access to \DELDEVAL031\datastore. Please let me know if that still does not work. |
@AtmaMani, @jyaistMap I have made changes to the notebook. Please let me know if any other changes required. |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). AtmaMani commented on 2019-11-14T22:49:53Z "You can obtain data by searching using keywords, for exxample" - typo in word example |
View / edit / reply to this conversation on ReviewNB (backstory for this conversation format). AtmaMani commented on 2019-11-14T22:49:54Z In this 2nd aggregation that runs on pySpark, you are grouping by both priyankatuteja commented on 2019-11-15T16:46:55Z I confirmed it, that is not the case. This |
@priyankatuteja the latest revision looks much better. This is a really good sample. I think we can borrow from this when we rewrite the geoanalytics module guide. I have asked a question about one of the aggregation methods, could you take a look at that? |
@AtmaMani Thanks for the review! I have made the changes you suggested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good
@priyankatuteja thanks for the sample, @jyaistMap thanks for the thorough review |
<insert pull request description here>
Checklist
Please go through each entry in the below checklist and mark an 'X' if that condition has been met. Every entry should be marked with an 'X' to be get the Pull Request approved.
import
s are in the first cell? First block of imports are standard libraries, second block are 3rd party libraries, third block are allarcgis
imports?GIS
object instantiations are one of the following?gis = GIS('https://www.arcgis.com', 'arcgis_python', 'P@ssword123')
gis = GIS(profile="your_online_profile")
gis = GIS('https://pythonapi.playground.esri.com/portal', 'arcgis_python', 'amazing_arcgis_123')
gis = GIS(profile="your_enterprise_portal")
./misc/setup.py
and/or./misc/teardown.py
?<img src="base64str_here">
instead of<img src="https://some.url">
? All map widgets contain a static image preview? (Callmapview_inst.take_screenshot()
to do so)os.path.join()
? (Instead ofr"\foo\bar"
,os.path.join(os.path.sep, "foo", "bar")
, etc.)