## Sources and details of the dataset

Foursquare is a social media platform where users can share their locations and rate venues. This dataset contains data as of September 2013.

The dataset is distributed as a torrent, so you need a torrent client to download it.

> https://archive.org/download/201309_foursquare_dataset_umn/201309_foursquare_dataset_umn_archive.torrent

<img src="images/raw_dataset.png">

### [Content of Files](https://archive.org/details/201309_foursquare_dataset_umn)

 * users.dat: Consists of a set of users such that each user has a unique id and a geospatial location (latitude and longitude) that represents the user's hometown.
 * venues.dat: Consists of a set of venues (e.g., restaurants) such that each venue has a unique id and a geospatial location (latitude and longitude).
 * checkins.dat: Marks the check-ins (visits) of users at venues. Each check-in has a unique id as well as the user id and the venue id.
 * socialgraph.dat: Contains the social graph edges (connections) that exist between users. Each social connection consists of two users (friends) represented by two unique ids (first_user_id and second_user_id).
 * ratings.dat: Consists of implicit ratings that quantify how much a user likes a specific venue.

## A proposition on what, why and how to work with the data

We don't have a clear hypothesis beforehand, so we will conduct an exploratory analysis.

### How will we work with the Foursquare dataset?

  * Download the dataset from the source via torrent.
  * Clean and transform it with Python.
  * Import it into PostgreSQL and design the schema, tables, and indexes for better performance.
  * Analyze it with SQL queries.
  * Visualize the results with Python (e.g., matplotlib).
  * Share the report and results in an IPython notebook.

## Outline of the report

* Introduction and details of the dataset.
* Basic descriptive analysis of each table.
* Social network analysis.
* Reviewers and review activity analysis.
* Venue popularity and rating analysis.

## Accessing data from file system and cleaning it

In [12]:
import time
import math
import os
import re
import filecmp

source_dir = os.getcwd()
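# Build the paths with os.path.join so the separators match the operating system
raw_data_folder = os.path.join(source_dir, "data", "raw")
clean_data_folder = os.path.join(source_dir, "data", "clean")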

for filename in filecmp.dircmp(raw_data_folder, clean_data_folder).common:
    print("{filename} already exists.".format(filename=filename))

for filename in filecmp.dircmp(raw_data_folder, clean_data_folder).left_only: #this prevents processing already cleaned data, swap with below comment if you want to process it anyway
#for filename in os.listdir(raw_data_folder):
    # Open the raw file read-only and the clean file for writing; both are closed automatically
    with open(os.path.join(raw_data_folder, filename), 'r') as f, \
         open(os.path.join(clean_data_folder, filename), 'w') as clean_file:
        start_time = math.trunc(time.time())
        for line_number, line in enumerate(f, start=1):
            if line_number <= 2:
                # Skip the first two lines (header row and separator row)
                continue
            elif line.endswith("rows)\n"):
                # The "(N rows)" footer marks the end of the data
                break
            else:
                # Remove all spaces and use ';' instead of '|' as the delimiter
                clean_format = line.replace(" ", "").replace("|", ";")
                clean_file.write(clean_format)
        end_time = math.trunc(time.time())
        print("Cleaned and created {filename} in {execute_time} seconds".format(execute_time=end_time-start_time, filename=filename))

Cleaned and created checkins.dat in 3 seconds
Cleaned and created ratings.dat in 5 seconds
Cleaned and created socialgraph.dat in 69 seconds
Cleaned and created users.dat in 7 seconds
Cleaned and created venues.dat in 2 seconds
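
The cleaning rules used in the loop above can be checked on a small in-memory sample before running them on the full files. The sketch below is only an illustration: `clean_lines` is a hypothetical helper name, and the sample rows are made up to imitate the layout the loop assumes (a header row, a separator row, `|`-delimited data rows, and a trailing `(N rows)` footer).

```python
def clean_lines(lines):
    """Yield cleaned rows: skip the first two lines, stop at the '(N rows)' footer."""
    for line_number, line in enumerate(lines, start=1):
        if line_number <= 2:
            continue  # header row and separator row
        if line.endswith("rows)\n"):
            break  # footer such as '(2 rows)' marks the end of the data
        # remove all spaces and switch the delimiter from '|' to ';'
        yield line.replace(" ", "").replace("|", ";")


# Made-up sample that imitates the raw layout (not real dataset values)
sample = [
    " id | latitude | longitude \n",
    "----+----------+-----------\n",
    "  1 |  45.0727 |  -93.4558 \n",
    "  2 |  30.6697 |  -81.4626 \n",
    "(2 rows)\n",
]

cleaned = list(clean_lines(sample))
print(cleaned)
```

Because every space is removed, any value that contains a space inside a field is also compacted, which matters later for the date-time column of the check-ins.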


In [13]:
import psycopg2

pghost = "localhost"
pguser = "postgres"
pgdatabase = "MEF-BDA-PROD"
pgport = "5432"
pgpassword = "123"
#Normally you shouldn't keep database credentials in your source code; it is acceptable here only because we work against a local database.

conn_string = 'host={pghost} port={pgport} dbname={pgdatabase} user={pguser} password={pgpassword}'.format(pgdatabase=pgdatabase,pguser=pguser,pgpassword=pgpassword,pghost=pghost,pgport=pgport)
conn=psycopg2.connect(conn_string)
cur=conn.cursor()

def check_if_table_exists(schema,table):
    # Pass the values as query parameters instead of formatting them into the SQL string
    cur.execute("select exists(select * from information_schema.tables where table_schema=%s AND table_name=%s)", (schema, table))
    return cur.fetchone()[0]

def check_if_index_exists(index):
    # Pass the value as a query parameter instead of formatting it into the SQL string
    cur.execute("SELECT EXISTS(SELECT * FROM PG_CLASS WHERE relname = %s)", (index,))
    return cur.fetchone()[0]

if(check_if_table_exists('ODS','EXT_FS_USERS')):
    print('Table ODS.EXT_FS_USERS already exists.')   
else:
    start_time = math.trunc(time.time())
    cur.execute("""
    CREATE TABLE "ODS"."EXT_FS_USERS"
    (
    id integer,
    latitude double precision,
    longitude double precision
    )

    TABLESPACE pg_default;

    ALTER TABLE "ODS"."EXT_FS_USERS"
    OWNER to postgres;
    """)
    end_time = math.trunc(time.time())
    cur.execute('COMMIT;')
    print("Table ODS.EXT_FS_USERS created in {execute_time} seconds.".format(execute_time=end_time-start_time))
    
    #start_time = math.trunc(time.time())
    #cmd_command = """"C:\\Program Files\\PostgreSQL\\13\\bin\\psql.exe" -h {pghost} -U {pguser} -d {pgdatabase} -p {pgport};
    #{pgpassword}; 
    #\COPY "ODS"."EXT_FS_VENUES" FROM '{datasource}' WITH (FORMAT CSV, DELIMITER ';');
    #""".format(pgdatabase=pgdatabase,pguser=pguser,pgpassword=pgpassword,pghost=pghost,pgport=pgport,datasource=clean_data_folder.replace('\\','/')+"/users.dat")
    #print(cmd_command)
    #os.system('cmd /k {cmd_command}'.format(cmd_command=cmd_command))
    #end_time = math.trunc(time.time())
    #print("Imported data to ODS.EXT_FS_USERS in {execute_time} seconds.".format(execute_time=end_time-start_time))

if(check_if_table_exists('ODS','EXT_FS_VENUES')):
    print('Table ODS.EXT_FS_VENUES already exists.')   
else:
    start_time = math.trunc(time.time())
    cur.execute("""
    CREATE TABLE "ODS"."EXT_FS_VENUES"
    (
    id integer,
    latitude double precision,
    longitude double precision
    )
    
    TABLESPACE pg_default;

    ALTER TABLE "ODS"."EXT_FS_VENUES"
    OWNER to postgres;
    """)
    cur.execute('COMMIT;')
    end_time = math.trunc(time.time())
    print("Table ODS.EXT_FS_VENUES created in {execute_time} seconds.".format(execute_time=end_time-start_time))

if(check_if_table_exists('ODS','EXT_FS_SOCIALGRAPH')):
    print('Table ODS.EXT_FS_SOCIALGRAPH already exists.')   
else:
    start_time = math.trunc(time.time())
    cur.execute("""
    CREATE TABLE "ODS"."EXT_FS_SOCIALGRAPH"
    (
    first_user_id integer,
    second_user_id integer
    )

    TABLESPACE pg_default;

    ALTER TABLE "ODS"."EXT_FS_SOCIALGRAPH"
    OWNER to postgres;
    """)
    end_time = math.trunc(time.time())
    cur.execute('COMMIT;')
    print("Table ODS.EXT_FS_SOCIALGRAPH created in {execute_time} seconds.".format(execute_time=end_time-start_time))

if(check_if_table_exists('ODS','EXT_FS_RATINGS')):
    print('Table ODS.EXT_FS_RATINGS already exists.')   
else:
    start_time = math.trunc(time.time())
    cur.execute("""
    CREATE TABLE "ODS"."EXT_FS_RATINGS"
    (
    user_id integer,
    venue_id integer,
    rating integer
    )

    TABLESPACE pg_default;

    ALTER TABLE "ODS"."EXT_FS_RATINGS"
    OWNER to postgres;
    """)
    end_time = math.trunc(time.time())
    cur.execute('COMMIT;')
    print("Table ODS.EXT_FS_RATINGS created in {execute_time} seconds.".format(execute_time=end_time-start_time))

if(check_if_table_exists('ODS','EXT_FS_CHECKINS')):
    print('Table ODS.EXT_FS_CHECKINS already exists.')   
else:
    start_time = math.trunc(time.time())
    cur.execute("""
    CREATE TABLE "ODS"."EXT_FS_CHECKINS"
    (
    id integer,
    user_id integer,
    venue_id integer,
    latitude double precision,
    longitude double precision,
    created_at text COLLATE pg_catalog."default"
    )

    TABLESPACE pg_default;

    ALTER TABLE "ODS"."EXT_FS_CHECKINS"
    OWNER to postgres;
    """)
    end_time = math.trunc(time.time())
    cur.execute('COMMIT;')
    print("Table ODS.EXT_FS_CHECKINS created in {execute_time} seconds.".format(execute_time=end_time-start_time))

Table ODS.EXT_FS_USERS already exists.
Table ODS.EXT_FS_VENUES already exists.
Table ODS.EXT_FS_SOCIALGRAPH created in 0 seconds.
Table ODS.EXT_FS_RATINGS already exists.
Table ODS.EXT_FS_CHECKINS created in 0 seconds.


Let's import the data with:

```postgresql
\COPY "ODS"."EXT_FS_USERS" FROM 'C:/Users/ahmet/Desktop/Data-ML/MEF-BDA/BDA-505/data/clean/users.dat' WITH (FORMAT CSV, DELIMITER ';');
\COPY "ODS"."EXT_FS_VENUES" FROM '[FilePath]/venues.dat' WITH (FORMAT CSV, DELIMITER ';', FORCE_NULL(latitude,longitude));
\COPY "ODS"."EXT_FS_SOCIALGRAPH" FROM '[FilePath]/socialgraph.dat' WITH (FORMAT CSV, DELIMITER ';');
\COPY "ODS"."EXT_FS_RATINGS" FROM '[FilePath]/ratings.dat' WITH (FORMAT CSV, DELIMITER ';');
\COPY "ODS"."EXT_FS_CHECKINS" FROM '[FilePath]/checkins.dat' WITH (FORMAT CSV, DELIMITER ';');
``` 

**Run these commands in psql (SQL Shell) rather than in an IDE, because `\COPY` is a psql meta-command and only works there. Do not confuse `\COPY` with the server-side `COPY` command.**
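
If running psql by hand is inconvenient, the same load could in principle be driven from the notebook, because psycopg2 cursors provide `copy_expert`, which streams a local file through `COPY ... FROM STDIN`. The following is a sketch under assumptions, not the method used for this report: `build_copy_sql` and `load_file` are hypothetical helper names, and `cur` is assumed to be an open psycopg2 cursor like the one created earlier.

```python
def build_copy_sql(schema, table, force_null=()):
    """Build a COPY ... FROM STDIN statement with the same options as the psql commands."""
    options = ["FORMAT CSV", "DELIMITER ';'"]
    if force_null:
        options.append("FORCE_NULL({})".format(", ".join(force_null)))
    return 'COPY "{}"."{}" FROM STDIN WITH ({})'.format(schema, table, ", ".join(options))


def load_file(cur, schema, table, path, force_null=()):
    """Stream a cleaned file from the client machine into the given table."""
    with open(path, "r") as f:
        # copy_expert sends the file contents to the server over the open connection
        cur.copy_expert(build_copy_sql(schema, table, force_null), f)


print(build_copy_sql("ODS", "EXT_FS_VENUES", force_null=("latitude", "longitude")))
```

A call such as `load_file(cur, 'ODS', 'EXT_FS_USERS', '[FilePath]/users.dat')` followed by `conn.commit()` would then stand in for the corresponding psql line.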

We have imported the Foursquare data into PostgreSQL tables, but there is one thing to do before working with it.

The date-time values failed to import as timestamps, so we imported them as text. We can convert them while creating a new table with a statement of this form:

```postgresql
CREATE TABLE TABLE_NAME AS
SELECT * 
FROM OLD_TABLE_NAME; 
```

In [15]:
if(check_if_table_exists('EDW','EXT_FS_CHECKINS')):
    print('Table EDW.EXT_FS_CHECKINS already exists.')   
else:
    start_time = math.trunc(time.time())
    cur.execute("""
    CREATE TABLE "EDW"."EXT_FS_CHECKINS" AS
    SELECT
    id,
    user_id,
    venue_id,
    latitude,
    longitude,
    to_timestamp(created_at, 'YYYY-MM-DDhh24:mi:ss') AS created_at
    FROM "ODS"."EXT_FS_CHECKINS";
    """)
    end_time = math.trunc(time.time())
    cur.execute('COMMIT;')
    print("Table EDW.EXT_FS_CHECKINS created in {execute_time} seconds.".format(execute_time=end_time-start_time))

Table EDW.EXT_FS_CHECKINS already exists.
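
The format string `'YYYY-MM-DDhh24:mi:ss'` has no separator between the date and the time. This is presumably because the cleaning step removes every space, so the two parts arrive joined together. The snippet below illustrates the equivalent parse in Python; the timestamp value is made up for illustration and is not taken from the dataset.

```python
from datetime import datetime

# Hypothetical created_at value after cleaning: date and time with no space between them
raw_created_at = "2012-05-2118:43:11"

# "%Y-%m-%d%H:%M:%S" mirrors PostgreSQL's 'YYYY-MM-DDhh24:mi:ss'
parsed = datetime.strptime(raw_created_at, "%Y-%m-%d%H:%M:%S")
print(parsed.isoformat())
```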


In [23]:
import pandas as pd

sql_command = """
SELECT * 
FROM "{schema}"."{table}"
LIMIT 200;
""".format(schema='ODS', table='EXT_FS_VENUES')
df = pd.read_sql(sql_command,conn)
df


In [50]:
sql_command = """
SELECT *
FROM
(
	SELECT popular_situation, venue_id, avg_rating, venue_checkings_number, venue_ratings_num,
		ROW_NUMBER() OVER(PARTITION BY popular_situation ORDER BY avg_rating desc) AS row_num
	FROM
	(
		SELECT popular_situation, venue_id, ROUND(AVG(rating),2) AS avg_rating, 
			ROUND(AVG(venue_checkings_number),0) as venue_checkings_number, 
			ROUND(AVG(venue_ratings_num),0) as venue_ratings_num
		FROM
		(
			WITH socialism_dt AS
			(
			SELECT user_id, friends_num, 
				CASE
				 WHEN friends_num > 10000 THEN 'Very Popular'
				 WHEN (friends_num <= 10000) AND (friends_num > 1000) THEN 'Popular'
				 WHEN (friends_num <= 1000) AND (friends_num > 200) THEN 'Average'
				 WHEN (friends_num <= 200) AND (friends_num > 20) THEN 'Not Popular'
				 WHEN (friends_num <= 20) AND (friends_num > 0) THEN 'Asocial' 
				END AS popular_situation
			FROM
			(
				SELECT first_user_id AS user_id, COUNT(*) AS friends_num
				FROM "ODS"."EXT_FS_SOCIALGRAPH"
				GROUP BY first_user_id
			) AS dt3
			),
			ratings_pop AS
			(
			SELECT fr.*, s.friends_num, s.popular_situation
			FROM "ODS"."EXT_FS_RATINGS" AS fr
			LEFT JOIN socialism_dt AS s
			ON s.user_id = fr.user_id
			),
			venue_ratings AS
			(
			SELECT venue_id, COUNT(*) AS venue_ratings_num
			FROM "ODS"."EXT_FS_RATINGS"
			GROUP BY venue_id
			),
			venue_checkings AS
			(
			SELECT venue_id, COUNT(*) AS venue_checkings_number
			FROM "EDW"."EXT_FS_CHECKINS"
			GROUP BY venue_id
			)
			SELECT rp.*, vc.venue_checkings_number, vr.venue_ratings_num 
			FROM ratings_pop AS rp
			LEFT JOIN venue_checkings AS vc ON rp.venue_id = vc.venue_id
			LEFT JOIN venue_ratings AS vr ON vr.venue_id = rp.venue_id
			WHERE vc.venue_checkings_number > 49
			AND vr.venue_ratings_num>49
		) AS dt4
		GROUP BY popular_situation, venue_id
	) AS dt5
) AS dt6
WHERE popular_situation IS NOT NULL;
"""

df_2 = pd.read_sql(sql_command,conn)
df_vis=df_2[['popular_situation', 'avg_rating', 'venue_checkings_number']]

In [51]:
from bokeh.plotting import figure, output_notebook, show
output_notebook()

In [52]:
# Data - Avg Rating vs Venue Checking
import pandas as pd
avg_rating = df_vis['avg_rating']
venue_checkings_number = df_vis['venue_checkings_number']

In [53]:
# Split the average ratings and check-in counts by user type
avg_rating_asocial = df_vis['avg_rating'].loc[df_vis['popular_situation']=='Asocial']
venue_checkings_number_asocial = df_vis['venue_checkings_number'].loc[df_vis['popular_situation']=='Asocial']
avg_rating_not_popular = df_vis['avg_rating'].loc[df_vis['popular_situation']=='Not Popular']
venue_checkings_number_not_popular = df_vis['venue_checkings_number'].loc[df_vis['popular_situation']=='Not Popular']
avg_rating_average = df_vis['avg_rating'].loc[df_vis['popular_situation']=='Average']
venue_checkings_number_average = df_vis['venue_checkings_number'].loc[df_vis['popular_situation']=='Average']
avg_rating_popular = df_vis['avg_rating'].loc[df_vis['popular_situation']=='Popular']
venue_checkings_number_popular = df_vis['venue_checkings_number'].loc[df_vis['popular_situation']=='Popular']
avg_rating_very_popular = df_vis['avg_rating'].loc[df_vis['popular_situation']=='Very Popular']
venue_checkings_number_very_popular = df_vis['venue_checkings_number'].loc[df_vis['popular_situation']=='Very Popular']

In [54]:
# Create the figure: p
p = figure(x_axis_label='Number of Check-ins at Venue', y_axis_label='Average Rating')

# Each user type gets its own circle glyph (size 10, alpha 0.7) and legend entry.
# legend_label replaces the older legend= keyword, which recent Bokeh versions no longer accept.

# Blue circles: 'Asocial' users
p.circle(venue_checkings_number_asocial, avg_rating_asocial,
         size=10,
         alpha=0.7,
         color='blue',
         legend_label='Asocial')

# Red circles: 'Not Popular' users
p.circle(venue_checkings_number_not_popular, avg_rating_not_popular,
         size=10,
         alpha=0.7,
         color='red',
         legend_label='Not Popular')

# Green circles: 'Average' users
p.circle(venue_checkings_number_average, avg_rating_average,
         size=10,
         alpha=0.7,
         color='green',
         legend_label='Average')

# Yellow circles: 'Popular' users
p.circle(venue_checkings_number_popular, avg_rating_popular,
         size=10,
         alpha=0.7,
         color='yellow',
         legend_label='Popular')

# Black circles: 'Very Popular' users
p.circle(venue_checkings_number_very_popular, avg_rating_very_popular,
         size=10,
         alpha=0.7,
         color='black',
         legend_label='Very Popular')

# Clicking a legend entry hides or shows the corresponding glyphs
p.legend.click_policy='hide'

# Display the plot
show(p)

This graph shows the number of check-ins against the average rating of venues for each user type (very popular, popular, average, not popular, asocial). The graph is interactive: clicking a user type in the legend hides or shows its points.

In [22]:
import pandas as pd

sql_command = """
SELECT *
FROM
(
	SELECT popular_situation, venue_id, avg_rating, venue_checkings_number, venue_ratings_num,
		ROW_NUMBER() OVER(PARTITION BY popular_situation ORDER BY avg_rating desc) AS row_num
	FROM
	(
		SELECT popular_situation, venue_id, ROUND(AVG(rating),2) AS avg_rating, 
			ROUND(AVG(venue_checkings_number),0) as venue_checkings_number, 
			ROUND(AVG(venue_ratings_num),0) as venue_ratings_num
		FROM
		(
			WITH socialism_dt AS
			(
			SELECT user_id, friends_num, 
				CASE
				 WHEN friends_num > 10000 THEN 'Very Popular'
				 WHEN (friends_num <= 10000) AND (friends_num > 1000) THEN 'Popular'
				 WHEN (friends_num <= 1000) AND (friends_num > 200) THEN 'Average'
				 WHEN (friends_num <= 200) AND (friends_num > 20) THEN 'Not Popular'
				 WHEN (friends_num <= 20) AND (friends_num > 0) THEN 'Asocial' 
				END AS popular_situation
			FROM
			(
				SELECT first_user_id AS user_id, COUNT(*) AS friends_num
				FROM "ODS"."EXT_FS_SOCIALGRAPH"
				GROUP BY first_user_id
			) AS dt3
			),
			ratings_pop AS
			(
			SELECT fr.*, s.friends_num, s.popular_situation
			FROM "ODS"."EXT_FS_RATINGS" AS fr
			LEFT JOIN socialism_dt AS s
			ON s.user_id = fr.user_id
			),
			venue_ratings AS
			(
			SELECT venue_id, COUNT(*) AS venue_ratings_num
			FROM "ODS"."EXT_FS_RATINGS"
			GROUP BY venue_id
			),
			venue_checkings AS
			(
			SELECT venue_id, COUNT(*) AS venue_checkings_number
			FROM "EDW"."EXT_FS_CHECKINS"
			GROUP BY venue_id
			)
			SELECT rp.*, vc.venue_checkings_number, vr.venue_ratings_num 
			FROM ratings_pop AS rp
			LEFT JOIN venue_checkings AS vc ON rp.venue_id = vc.venue_id
			LEFT JOIN venue_ratings AS vr ON vr.venue_id = rp.venue_id
			WHERE vc.venue_checkings_number > 49
			AND vr.venue_ratings_num>49
		) AS dt4
		GROUP BY popular_situation, venue_id
	) AS dt5
) AS dt6
WHERE row_num < 4
AND popular_situation IS NOT NULL
ORDER BY popular_situation, avg_rating desc
LIMIT 20;
"""
df = pd.read_sql(sql_command,conn)
df

| | popular_situation | venue_id | avg_rating | venue_checkings_number | venue_ratings_num | row_num |
|---|---|---|---|---|---|---|
| 0 | Asocial | 33177 | 3.0 | 115.0 | 318.0 | 1 |
| 1 | Asocial | 6339 | 2.99 | 86.0 | 307.0 | 2 |
| 2 | Asocial | 73804 | 2.93 | 66.0 | 128.0 | 3 |
| 3 | Average | 501132 | 5.0 | 54.0 | 55.0 | 1 |
| 4 | Average | 110033 | 5.0 | 74.0 | 83.0 | 2 |
| 5 | Average | 94278 | 5.0 | 121.0 | 136.0 | 3 |
| 6 | Not Popular | 46646 | 5.0 | 50.0 | 51.0 | 1 |
| 7 | Not Popular | 565838 | 4.5 | 50.0 | 52.0 | 2 |
| 8 | Not Popular | 333967 | 4.33 | 59.0 | 62.0 | 3 |
| 9 | Popular | 112490 | 5.0 | 75.0 | 77.0 | 1 |



This query first assigns each user one of five user types based on the number of friends, using the following rules:

 * `friends_num > 10000`: 'Very Popular'
 * `1000 < friends_num <= 10000`: 'Popular'
 * `200 < friends_num <= 1000`: 'Average'
 * `20 < friends_num <= 200`: 'Not Popular'
 * `0 < friends_num <= 20`: 'Asocial'

Users with no rows in the social graph get no user type (NULL) through the LEFT JOIN and are excluded by the `popular_situation IS NOT NULL` filter.

For each user type, the query then finds the 3 venues with the highest average rating given by users of that type, among venues that have at least 50 check-ins and at least 50 ratings. As the output shows, the highest-rated venue for asocial users has an average rating of only 3.00.
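
The thresholds in the `CASE` expression can be mirrored in a few lines of Python, which makes the boundary values easy to check. This is only an illustrative sketch; `popularity_bucket` is a hypothetical name and is not part of the notebook.

```python
def popularity_bucket(friends_num):
    """Return the user type for a friend count, mirroring the SQL CASE expression."""
    if friends_num > 10000:
        return "Very Popular"
    if friends_num > 1000:
        return "Popular"
    if friends_num > 200:
        return "Average"
    if friends_num > 20:
        return "Not Popular"
    if friends_num > 0:
        return "Asocial"
    return None  # no branch matches, like a CASE without ELSE returning NULL


# Print the user type on both sides of every threshold
for n in (0, 20, 21, 200, 201, 1000, 1001, 10000, 10001):
    print(n, popularity_bucket(n))
```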

Column explanations:

 * popular_situation: user type, based on the number of friends
 * venue_id: venue id
 * avg_rating: average rating of the venue
 * venue_checkings_number: number of check-ins at the venue
 * venue_ratings_num: number of ratings for the venue
 * row_num: rank of the venue within its user type
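
The `row_num` column comes from `ROW_NUMBER() OVER(PARTITION BY popular_situation ORDER BY avg_rating desc)` combined with the `row_num < 4` filter. As a rough illustration of that "top 3 per group" logic, the sketch below reproduces it in plain Python on a few rows copied from the output above (deliberately shuffled); `top_n_per_group` is a hypothetical helper name.

```python
from itertools import groupby
from operator import itemgetter

# (popular_situation, venue_id, avg_rating), values copied from the query output
rows = [
    ("Asocial", 6339, 2.99),
    ("Asocial", 33177, 3.0),
    ("Asocial", 73804, 2.93),
    ("Popular", 112490, 5.0),
    ("Not Popular", 565838, 4.5),
    ("Not Popular", 46646, 5.0),
    ("Not Popular", 333967, 4.33),
]


def top_n_per_group(rows, n=3):
    """Rank rows by avg_rating (descending) within each user type and keep the top n."""
    ranked = []
    # Sort by user type first so groupby sees each type as one contiguous run
    by_type = sorted(rows, key=itemgetter(0))
    for _, members in groupby(by_type, key=itemgetter(0)):
        ordered = sorted(members, key=itemgetter(2), reverse=True)
        for row_num, row in enumerate(ordered[:n], start=1):
            ranked.append(row + (row_num,))
    return ranked


for row in top_n_per_group(rows):
    print(row)
```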

In [18]:
sql_command

'\nSELECT * \nFROM "ODS"."EXT_FS_CHECKINS"\nLIMIT 20;\n'

In [10]:
cur.close()
conn.close()

## Sources:

 1. [Dataset](https://archive.org/details/201309_foursquare_dataset_umn)

 2. [How to open every file in a folder?](https://stackoverflow.com/questions/18262293/how-to-open-every-file-in-a-folder)

 3. [Python- reading a file line by line and processing](https://stackoverflow.com/questions/53749062/python-reading-a-file-line-by-line-and-processing)

 4. [File and Directory Comparisons with Python](https://janakiev.com/blog/python-filecmp/)

 5. [Checking if a table exist with psycopg2 on postgreSQL](https://stackoverflow.com/questions/1874113/checking-if-a-postgresql-table-exists-under-python-and-probably-psycopg2)

 6. [Checking if index exist](https://stackoverflow.com/questions/45983169/checking-for-existence-of-index-in-postgresql)