docs(connector): add info docs
nzrymiak committed Mar 21, 2021
1 parent 59603e5 commit cb8cb5c
Showing 4 changed files with 139 additions and 0 deletions.
139 changes: 139 additions & 0 deletions docs/source/user_guide/connector/info.ipynb
@@ -0,0 +1,139 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ```info()``` Function"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ```info()``` function provides guidelines for using Connector.\n",
"\n",
"**info() can be called using a Connector object, without parameters:**"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"from dataprep.connector import connect, info\n",
"\n",
"# Access tokens can be accessed generated here: https://www.yelp.com/developers/documentation/v3/authentication\n",
"dc = connect('yelp', _auth={'access_token':'cCMHU4M4t7rdt*********vp3whGzFjgIKIm0'})\n",
"\n",
"dc.info()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Or by specifying the API to query as a parameter:**"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"info('yelp')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Parameters\n",
"\n",
"* ```config_path``` is the path to the folder containing configuration files. There are two ways to load configuration files. Details can be found in the previous configuraton file section.\n",
"\n",
"* ```update``` is used to specify if new configuration files should be pulled from the GitHub repo where up to date configuration files are hosted."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Response\n",
"\n",
"* ```Table``` displays table(s) of data that can be accessed. Each table has a corresponding API endpoint which will be queried automatically by connector.\n",
"* ```Parameters``` identifies parameters that can be used in the query function to access specific data. info() indicates if the parameter is required for all queries of the specified table.\n",
"* ```Examples``` shows how methods of the Connector class can be called. The access_token value must be replaced with an authorization key that can be generated by following instructions on the developer website of the API.\n",
"* ```Schema``` displays column names and column data types of the DataFrame returned by the query function."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Examples "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is output of the info function for the Yelp API. \n",
"- Yelp has one table called \"businesses\" which contains information such as rating and location data of businesses.\n",
"- The businesses table has seven parameters: location, term, latitude, longitude, limit, categories and sort_by. The location parameter is required and must be specified in each query while the other parameters are optional. These parameters are specific to the businesses table and can be used to access certain types of business data.\n",
"- The example shows how to create a connector object for Yelp, given the _auth parameter to authenticate oneself and the _concurrency parameter to speed up data acquisition. The connector object \"dc\" can then be used to query the API via the businesses table. A dataframe will be returned with 20 rows of data for businesses located in Seattle. More details can be found in the \"connect\" and \"query\" sections.\n",
"- The schema shows there will be 20 columns of data returned when querying the businesses table. Each row of the schema displays a column name and its corresponding data type. For example the \"name\" and \"image_url\" columns contain string data while \"latitude\" and \"longitude\" columns contain float data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dataprep.connector import info\n",
"info('yelp', update=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](assets/yelp_params_example.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](assets/yelp_schema_top.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](assets/yelp_schema_bottom.png)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
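
The Response fields described in the notebook above (Table, Parameters, Schema) can be pictured with a small plain-Python sketch. This is only an illustration under stated assumptions: `TableInfo` and `missing_required` are hypothetical names invented here and are not part of dataprep. The values come from the Yelp description in the notebook, and only the four schema columns named there are listed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TableInfo:
    """Hypothetical stand-in for one table reported by info(); not dataprep's own type."""

    name: str
    required_params: List[str]
    optional_params: List[str]
    # Maps column name to data type, as in the Schema part of the output.
    schema: Dict[str, str] = field(default_factory=dict)

    def missing_required(self, query_params: Dict[str, object]) -> List[str]:
        """Return the required parameters that the caller did not supply."""
        return [p for p in self.required_params if p not in query_params]


# Values taken from the notebook's Yelp example. The notebook mentions
# 20 schema columns but names only these four, so the rest are left out.
businesses = TableInfo(
    name="businesses",
    required_params=["location"],
    optional_params=["term", "latitude", "longitude", "limit", "categories", "sort_by"],
    schema={
        "name": "string",
        "image_url": "string",
        "latitude": "float",
        "longitude": "float",
    },
)

# A query without "location" lacks a required parameter.
print(businesses.missing_required({"term": "coffee"}))
# A query with "location" satisfies the requirement.
print(businesses.missing_required({"location": "Seattle"}))
```

A check like this mirrors what the Parameters part of the output communicates: a businesses query needs `location`, while the other six parameters may be omitted.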
