In [1]:
from __future__ import absolute_import, division, print_function, unicode_literals

import logging

import cipy

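# quiet the root logger in this notebook; looping avoids an IndexError
# when no handler has been attached yet
logger = logging.getLogger()
for handler in logger.handlers:
    handler.setLevel(logging.CRITICAL)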

conn_creds = cipy.db.get_conn_creds('DATABASE_URL')
pgdb = cipy.db.PostgresDB(conn_creds)

---

## User Management

- create new user accounts, storing passwords as salted hashes rather than plain text
- delete existing user accounts (along with any projects they own)
- log in existing users by email and password
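
The `password` value stored in the `users` table further down has the shape of a bcrypt hash. As a minimal sketch of the general salt, hash, and verify pattern (not the project's actual implementation), the following swaps in PBKDF2 from the Python 3 standard library so that it runs without extra dependencies. The function names and the `iterations$salt$digest` storage format are assumptions made for illustration.

```python
import hashlib
import hmac
import os


def hash_password(password, iterations=100000):
    """Return 'iterations$salt_hex$digest_hex' (hypothetical storage format)."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'), salt, iterations)
    return '{}${}${}'.format(iterations, salt.hex(), digest.hex())


def verify_password(password, stored):
    """Recompute the digest with the stored salt; compare in constant time."""
    iterations, salt_hex, digest_hex = stored.split('$')
    digest = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'),
        bytes.fromhex(salt_hex), int(iterations))
    return hmac.compare_digest(digest.hex(), digest_hex)
```

At login, only `verify_password(entered_password, stored_value)` is needed; the plain-text password never has to be stored or logged.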

In [2]:
%run ../scripts/create_user.py --test

Enter user name: Burton DeWilde
Enter user email: burtdewilde@gmail.com
Confirm user email: burtdewilde@gmail.com
Enter password: ········
Confirm password: ········


2016-07-11 22:02:15,274 - create_user - INFO - created user (TEST): {'name': 'Burton DeWilde', 'owned_project_ids': None, 'email': 'burtdewilde@gmail.com', 'project_ids': None, 'created_ts': '2016-07-12T02:02:03.990354Z', 'password': 'test'}


In [18]:
list(pgdb.run_query('SELECT * FROM users'))

[{'created_ts': datetime.datetime(2016, 7, 7, 17, 56, 7),
  'email': 'burtdewilde@gmail.com',
  'name': 'Burton DeWilde',
  'owned_project_ids': [1],
  'password': '$2a$06$5qfLF4y/sfXkc8XhZ360i.48V5GaQfxF5Uy8zVJcO6dLmUqX9JGie',
  'project_ids': [1],
  'user_id': 1}]

In [3]:
%run ../scripts/login_user.py

Enter email: burtdewilde@gmail.com
Enter password: ········


2016-07-11 22:02:22,761 - login_user - INFO - Welcome, Burton DeWilde id=1


In [6]:
%run ../scripts/delete_user.py --user_id=1 --test

Conservation International test project (2016-07-07)


Continue anyway (y/n)? y


2016-07-11 22:03:43,419 - delete_project - INFO - deleted owned project id=1 (TEST)
2016-07-11 22:03:43,420 - delete_project - INFO - deleted user id=1 from projects (TEST)
2016-07-11 22:03:43,420 - delete_project - INFO - deleted user id=1 (TEST)


---

## Review Management

- create new reviews (with user as owner)
- delete existing owned reviews
- invite/uninvite other users to collaborate on existing reviews
- transfer ownership of an owned review to another user

In [24]:
%run ../scripts/create_project.py --user_id=1 --test

Enter project name: foo
Enter project description (optional): bar


2016-07-08 14:11:33,108 - create_project - INFO - created project (TEST): {'created_ts': '2016-07-07T18:50:49.380396Z', 'name': 'foo', 'owner_user_id': 1, 'description': 'bar', 'user_ids': [1]}


In [7]:
%run ../scripts/delete_project.py --user_id=1 --project_id=1 --test

2016-07-11 22:04:22,642 - delete_project - INFO - deleted project id=1 (TEST)


---

## Review Planning

- facilitate systematic review planning while gathering structured data that both informs, and is informed by, the citation pre-screening process; the user enters the following fields:
    - objective
    - research questions, ranked
    - PICO statements
    - grouped keyterms (with automatic boolean search query generation)
    - data sources
    - inclusion/exclusion criteria, with shorthand labels
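
One plausible way to generate a boolean search query from grouped keyterms is to join the terms within a group with `OR` and the groups with `AND`. The sketch below assumes that convention; the function name and the example keyterms are made up for illustration and are not taken from the project.

```python
def build_boolean_query(keyterm_groups):
    """OR the terms within each group, AND the groups together (assumed convention)."""
    clauses = []
    for group in keyterm_groups:
        terms = [term.strip() for term in group if term.strip()]
        if not terms:
            continue  # ignore empty groups
        # quote multi-word terms so they are searched as phrases
        quoted = ['"{}"'.format(t) if ' ' in t else t for t in terms]
        clauses.append('(' + ' OR '.join(quoted) + ')')
    return ' AND '.join(clauses)


# illustrative keyterm groups only
groups = [
    ['conservation', 'protected area'],
    ['well-being', 'livelihood'],
]
print(build_boolean_query(groups))
# (conservation OR "protected area") AND (well-being OR livelihood)
```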

In [4]:
%run ../scripts/plan_review.py --user_id=1 --project_id=1 --test

Enter review objective:
Scope and identify studies that document and/or measure the impacts of nature conservation interventions on human well-being at local to regional scales.
Enter research question:
What are the impacts of nature conservation interventions on different domains of human well-being in developing countries?
Add another question (y/n)? y
Enter research question:
What is the current state and distribution of evidence?
Add another question (y/n)? y
Enter research question:
What types of impacts from conservation interventions on human well-being are measured?
Add another question (y/n)? n


... etc. This is very much a work in progress.

---

## Citation Ingestion and De-duplication

- load citations from RIS or BibTeX files, then parse, standardize, sanitize, validate, and store the data
- identify duplicate citations by scoring candidate pairs with a model loaded from a settings file and clustering those above a score threshold; the most complete record in each set of duplicates is assigned as the "canonical" record
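
For the parsing step, RIS files consist of lines with a two-character tag, two spaces, a hyphen, and a value, and each record ends with an `ER` tag. The following is a minimal sketch of reading that layout, not the parser used by `ingest_citations.py`: the function name is hypothetical, the sample record is invented, and continuation lines are simply skipped.

```python
import re
from collections import defaultdict

# tag, two spaces, hyphen, optional space, value
TAG_LINE = re.compile(r'^([A-Z][A-Z0-9])  - ?(.*)$')


def parse_ris(text):
    """Yield one dict of tag -> list of values for each RIS record."""
    record = defaultdict(list)
    for line in text.splitlines():
        match = TAG_LINE.match(line.rstrip())
        if not match:
            continue  # blank and continuation lines are skipped in this sketch
        tag, value = match.groups()
        if tag == 'ER':  # end-of-record marker
            if record:
                yield dict(record)
            record = defaultdict(list)
        else:
            record[tag].append(value.strip())


# invented sample record, for illustration only
sample = """TY  - JOUR
AU  - Doe, J
AU  - Roe, R
TI  - An example title
PY  - 2013
ER  - 
"""
records = list(parse_ris(sample))
```

Keeping every tag as a list preserves repeated fields such as multiple `AU` (author) lines; standardization and validation would follow as separate steps.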

In [13]:
%run ../scripts/ingest_citations.py --citations ../data/raw/citation_files/phase_2_demo_citations.ris --user_id=1 --project_id=1 --test

2016-07-11 22:24:20,813 - ingest_citations - INFO - parsing records in ../data/raw/citation_files/phase_2_demo_citations.ris
2016-07-11 22:24:20,814 - ingest_citations - INFO - valid record: Ecological protection and well-being, 2013
2016-07-11 22:24:20,815 - ingest_citations - INFO - valid record: The economic value of forest ecosystems, 2001
2016-07-11 22:24:20,816 - ingest_citations - INFO - valid record: Contribution of tourism development to protected area management: Local stakeholder perspectives, 2009
2016-07-11 22:24:20,817 - ingest_citations - INFO - 3 valid records inserted into appname db (TEST)


In [17]:
num_citations = list(pgdb.run_query('SELECT COUNT(1) FROM citations WHERE project_id = 1'))[0]['count']
print('total # citations =', num_citations)

total # citations = 28709


In [7]:
%run ../scripts/dedupe_records.py --project_id=1 --threshold=auto --settings=../models/dedupe_citations_settings --test

2016-07-11 22:19:58,881 - dedupe_records - INFO - reading dedupe settings from ../models/dedupe_citations_settings
2016-07-11 22:20:19,236 - dedupe_records - INFO - duplicate threshold = 0.827943
2016-07-11 22:20:19,879 - dedupe_records - INFO - found 361 duplicate clusters
2016-07-11 22:20:20,020 - dedupe_records - INFO - inserted 726 records into duplicates db (TEST)
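
Once duplicate clusters are found, one record per cluster is chosen as canonical. A minimal sketch of "most complete record wins" is shown below, assuming completeness means the number of non-empty fields among those in the `citations` table queried later in this notebook. The function names, the tie-breaking rule (lowest `citation_id`), and the example records are assumptions.

```python
FIELDS = ('authors', 'title', 'abstract', 'publication_year', 'doi')


def completeness(record):
    """Count the fields that are present and non-empty."""
    return sum(1 for field in FIELDS if record.get(field) not in (None, '', []))


def pick_canonical(cluster):
    """Return the citation_id of the most complete record; lowest id breaks ties."""
    best = max(cluster, key=lambda rec: (completeness(rec), -rec['citation_id']))
    return best['citation_id']


# invented records, for illustration only
cluster = [
    {'citation_id': 1, 'title': 'A title', 'publication_year': 2009, 'doi': None},
    {'citation_id': 2, 'title': 'A title', 'publication_year': 2009,
     'abstract': 'Some text'},
]
canonical_id = pick_canonical(cluster)  # record 2 has one more filled field
```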


In [2]:
query = """
SELECT canonical_citation_id, array_agg(citation_id) AS citation_ids, AVG(duplicate_score) AS avg_score
FROM duplicates
GROUP BY 1 HAVING AVG(duplicate_score) < 0.9 ORDER BY 1 ASC
LIMIT 1
"""
dupes = list(pgdb.run_query(query))[0]
print('citations {} are duplicates with avg. duplicate score = {}'.format(
        dupes['citation_ids'], round(dupes['avg_score'], 6)))

query = """
SELECT citation_id, authors, title, abstract, publication_year, doi
FROM citations
WHERE citation_id = ANY(%(citation_ids)s)
"""
for record in pgdb.run_query(query, {'citation_ids': dupes['citation_ids']}):
    cipy.db.db_utils.present_citation(record)

citations [497, 496] are duplicates with avg. duplicate score = 0.827182

TITLE:    INTEGRATIVE SOCIAL WORK APPROACH AS A CONTEXT FOR UNDERSTANDING THE INDIVIDUAL SOCIAL CARE PLAN
YEAR:     2009
AUTHORS:  Ajdukovic, M; Urbanc, K
ABSTRACT: The article deals with the issue of introducing the individual social core plan as one of the initiatives pertaining to the long-awaited social core system reform. The ideas of the individual social care plan are placed within a theoretical framework of the integrative social work approach and in the context of numerous changes which have occurred in the lost twenty years at the conceptual and practical level of offering integrated and coordinated services to service users, ie case management, care management, person-centred planning, etc. Based on the experience that, in order to change the organisation of the Centres for Social Care and develop efficient social care services, the changes to the legal framework are not sufficient but it is necessary 

---

## Initial Ranking of Citations

- sample citations ranked by their overlap with the planned keyterms; the user pre-screens them until 10 have been included and 10 have been excluded
- based on the included/excluded citations, re-rank the remaining citations by their ratio of relevant to irrelevant keyterms and present those most likely to be relevant for pre-screening
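
Both ranking stages can be sketched with simple set operations. This is an illustration under stated assumptions rather than the project's scoring code: keyterms are treated as single lowercase words, the ratio is smoothed by adding 1 to both counts to avoid division by zero, and all names and example texts are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r'[a-z0-9]+', text.lower()))


def overlap_score(text, keyterms):
    """Number of distinct keyterms found in the text."""
    return len(tokenize(text) & set(keyterms))


def relevance_ratio(text, relevant_terms, irrelevant_terms):
    """Smoothed ratio of relevant to irrelevant keyterm hits."""
    tokens = tokenize(text)
    n_relevant = len(tokens & set(relevant_terms))
    n_irrelevant = len(tokens & set(irrelevant_terms))
    return (n_relevant + 1) / (n_irrelevant + 1)


def rank(citations, score_fn):
    """Sort (citation_id, text) pairs from highest to lowest score."""
    return sorted(citations, key=lambda pair: score_fn(pair[1]), reverse=True)


# invented citations and keyterms, for illustration only
citations = [
    (1, 'Forest conservation and household income'),
    (2, 'Social care planning in hospitals'),
    (3, 'Conservation, tourism and income in protected forest areas'),
]
keyterms = ['conservation', 'forest', 'income', 'tourism']
ranked_ids = [cid for cid, _ in
              rank(citations, lambda text: overlap_score(text, keyterms))]
# ranked_ids == [3, 1, 2]
```

After the first 10 inclusions and 10 exclusions, the same `rank` helper could be reused with `relevance_ratio` as the scoring function.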

---

## Refinement of Search Keyterms

- based on the included/excluded citations, create lists of strongly relevant and strongly irrelevant keyterms that can be used to refine the initial set of keyterms
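
One way to derive such lists is to compare how often each term appears in included versus excluded citations. The sketch below uses smoothed log-odds of document frequency, which is an assumed technique chosen for illustration and not necessarily what the project uses; the function names and example texts are hypothetical. Strongly positive scores suggest relevant keyterms, and strongly negative scores suggest irrelevant ones.

```python
import math
import re
from collections import Counter


def doc_frequencies(texts):
    """Count, for each token, the number of texts it appears in."""
    counts = Counter()
    for text in texts:
        counts.update(set(re.findall(r'[a-z0-9]+', text.lower())))
    return counts


def keyterm_log_odds(included, excluded):
    """Smoothed log-odds of each token appearing in included vs. excluded texts."""
    inc, exc = doc_frequencies(included), doc_frequencies(excluded)
    scores = {}
    for term in set(inc) | set(exc):
        # add-one smoothing keeps both probabilities strictly between 0 and 1
        p_inc = (inc[term] + 1) / (len(included) + 2)
        p_exc = (exc[term] + 1) / (len(excluded) + 2)
        scores[term] = (math.log(p_inc / (1 - p_inc))
                        - math.log(p_exc / (1 - p_exc)))
    return scores


# invented texts, for illustration only
included = ['forest conservation income', 'forest tourism']
excluded = ['hospital care', 'hospital income']
scores = keyterm_log_odds(included, excluded)
```

Here `forest` scores positive, `hospital` scores negative, and `income`, which appears equally often on both sides, scores zero.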