## Setup

Install Kingfisher Colab and required packages:

In [None]:
%%shell

pip install --upgrade 'ocdskingfishercolab<0.4' pandas psycopg2-binary > pip.log

Import functions:

In [None]:
from ocdskingfishercolab import (
    list_source_ids,
    list_collections,
    set_spreadsheet_name,
    save_dataframe_to_sheet,
    set_search_path)

Load the [ipython-sql](https://pypi.org/project/ipython-sql/) and [data_table](https://colab.research.google.com/notebooks/data_table.ipynb) extensions, and configure ipython-sql:

In [None]:
%load_ext sql
%load_ext google.colab.data_table
%config SqlMagic.autopandas = True  # Return Pandas DataFrames instead of regular result sets
%config SqlMagic.displaycon = False  # Don't show connection string after execute
%config SqlMagic.feedback = False  # Don't print number of rows affected by DML

Enter credentials and connect to database:

> **Helpdesk analysts:** See [CRM-6335](https://crm.open-contracting.org/issues/6335).

In [None]:
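import getpass
from urllib.parse import quote_plus

print('Enter your Kingfisher credentials')
user = input('Username:')
password = getpass.getpass('Password:')

# URL-encode the credentials so that special characters (e.g. `@`, `/`, `:`) don't break the connection string
connection_string = 'postgresql://' + quote_plus(user) + ':' + quote_plus(password) + '@postgres-readonly.kingfisher.open-contracting.org/ocdskingfisherprocess?sslmode=require'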

%sql $connection_string

Generate a list of schemas and their selected collections:

In [None]:
%%capture collections

import pandas as pd

# Get a list of schemas that contain the `selected_collections` table

list_schemas = """

SELECT
	schemaname
FROM
	pg_tables
WHERE
	tablename = 'selected_collections'

"""

schemas = %sql {list_schemas}

# Get the selected collections from each schema and store the results in a DataFrame

template = """

SELECT
  '{schema}' as schema_name,
  array_agg(id) as collections
FROM
  {schema}.selected_collections

  """

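collections_list = pd.DataFrame()

for schema in schemas['schemaname'].to_list():

  statement = template.format(schema=schema)

  # `result` may be None if the query failed (ipython-sql prints the error); `pd.concat` drops None entries
  result = %sql {statement}
  # `DataFrame.append` was removed in pandas 2.0, so use `pd.concat` instead
  collections_list = pd.concat([collections_list, result])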
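
The loop above interpolates each schema name directly into the SQL text with `str.format`, so it can help to check that the name looks like a plain identifier before rendering the statement. The following is a minimal sketch, not part of Kingfisher Colab: `render_statement` and the schema name `view_data_example` are hypothetical, and it assumes schema names contain only letters, digits and underscores.

```python
import re

TEMPLATE = """
SELECT
  '{schema}' AS schema_name,
  array_agg(id) AS collections
FROM
  {schema}.selected_collections
"""

# Assumption: schema names are plain, unquoted PostgreSQL identifiers
IDENTIFIER = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')


def render_statement(schema, template=TEMPLATE):
    """Return the SQL statement for one schema, rejecting unexpected names."""
    if not IDENTIFIER.fullmatch(schema):
        raise ValueError(f'Unexpected schema name: {schema!r}')
    return template.format(schema=schema)


statement = render_statement('view_data_example')  # hypothetical schema name
print(statement)
```

In the notebook, the rendered statement would then be passed to `%sql {statement}` as before.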


Log errors:

In [None]:
# Some schemas listed in `pg_tables` (and `information_schema.views`) are not accessible. Log those errors and warn the user.

if len(collections.stdout) > 0:
  print('`selected_collections` is not accessible for some schemas. See collections.log for details')
  %store collections.stdout > collections.log
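
The `%%capture` and `%store` magics used above are only available in IPython. Outside a notebook, a rough equivalent is to redirect standard output to a buffer and write it to a log file only if something was printed. This is a sketch under assumptions: `run_queries` is a hypothetical stand-in for the loop over schemas, `capture_and_log` is a hypothetical helper, and the log is written to the system temporary directory rather than the working directory.

```python
import contextlib
import io
import tempfile
from pathlib import Path


def run_queries():
    # Hypothetical stand-in for the loop over schemas: prints an error message
    print('permission denied for schema example_schema')


def capture_and_log(func, log_path):
    """Run func, capture what it prints, and write it to log_path if non-empty.

    Returns the captured text.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func()
    captured = buffer.getvalue()
    if captured:
        Path(log_path).write_text(captured)
    return captured


# Assumption: the temporary directory is writable
log_path = Path(tempfile.gettempdir()) / 'collections.log'
captured = capture_and_log(run_queries, log_path)

if captured:
    print(f'`selected_collections` is not accessible for some schemas. See {log_path} for details')
```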