# Extracting relations from [CRAFT 3.1](https://github.com/UCDenver-ccp/CRAFT)

This notebook demonstrates how to extract relations using [Dep2Rel](https://github.com/tuh8888/Dep2Rel/) from the [CRAFT 3.1](https://github.com/UCDenver-ccp/CRAFT) dataset.

## The data

[CRAFT 3.1](https://github.com/UCDenver-ccp/CRAFT) contains both semantic and structural annotations. 

### Semantic annotations
Semantic annotations (concept annotations) are used in named entity recognition (NER) tasks. In CRAFT, they were made using 10 of the Open Biomedical Ontologies. Each ontology serves as a formal dictionary that maps persistent URIs to definitions and to relationships between terms, including subsumption relations, so that the terms form a hierarchy. The URIs serve as the tags for these annotations.

The format of the CRAFT semantic annotations is Knowtator XML, but we will convert these to Knowtator 2 XML.
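
As a rough illustration of how subsumption relations arrange ontology terms into a hierarchy, the sketch below uses a toy child-to-parent map. The `EX:` identifiers and the function names are made up for this example; they are not CRAFT URIs or part of Dep2Rel.

```clojure
;; Hypothetical toy hierarchy: each term maps to the term that subsumes it.
(def parent-of
  {"EX:neuron" "EX:cell"
   "EX:cell"   "EX:anatomical-entity"})

(defn ancestors-of
  "Returns the terms that subsume `term`, nearest ancestor first."
  [term]
  ;; Follow parent links until a term has no parent (lookup returns nil).
  (take-while some? (rest (iterate parent-of term))))

(defn subsumed-by?
  "True when `candidate` is an ancestor of `term` in the toy hierarchy."
  [term candidate]
  (boolean (some #{candidate} (ancestors-of term))))

(println (ancestors-of "EX:neuron"))
(println (subsumed-by? "EX:neuron" "EX:anatomical-entity"))
```

A tagger that annotates a mention with the most specific term can then still be matched against a more general term by walking up such a hierarchy.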

### Structural annotations
Structural annotations consist of part-of-speech (POS) tags, treebank (dependency parses), and span/section tagging. Here, we will mostly take advantage of the dependency parses, which define syntactic relations between tokens within a sentence.

The format of the CRAFT syntactic annotations is Penn Treebank, but we will convert these to CoNLL-U.
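
For orientation, each token line in a CoNLL-U file has ten tab-separated fields, including the index of the token's head and the dependency relation to it. The following minimal sketch parses one such line into a map; the example line and the function name `parse-conllu-line` are illustrative assumptions, not taken from CRAFT or Dep2Rel.

```clojure
(require '[clojure.string :as str])

;; The ten CoNLL-U columns, in file order.
(def conllu-fields
  [:id :form :lemma :upos :xpos :feats :head :deprel :deps :misc])

(defn parse-conllu-line
  "Splits one tab-separated CoNLL-U token line into a map of field -> string."
  [line]
  (zipmap conllu-fields (str/split line #"\t")))

;; Hypothetical token line: the word "cells" attached to token 3 as its subject.
(def example-line
  "2\tcells\tcell\tNOUN\tNNS\tNumber=Plur\t3\tnsubj\t_\t_")

(println (select-keys (parse-conllu-line example-line) [:form :head :deprel]))
```

The `:head` and `:deprel` fields together encode the edges of the dependency tree for a sentence.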

In [None]:
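%%bash
# Clone CRAFT and convert its annotations with the repository's boot tasks.
cd /media/tuh8888/Seagate\ Expansion\ Drive/data/craft-versions
git clone https://github.com/UCDenver-ccp/CRAFT.git
# The boot tasks are assumed to run from the root of the cloned repository.
cd CRAFT
# Concept annotations -> Knowtator 2 XML
boot all-concepts -x convert -k
# Treebank -> CoNLL-U
boot treebank convert -u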

## Relation extraction

Now that we have some data in the correct formats, we can read it in.

In [7]:
;; Root directory holding the word vectors and the CRAFT data.
(def home-dir
  (io/file "/" "media" "tuh8888" "Seagate Expansion Drive" "data"))

;; Path to the word-vector file.
(def word-vector-dir
  (io/file home-dir "WordVectors"))
(def word2vec-db
  (.getAbsolutePath
    (io/file word-vector-dir "bio-word-vectors-clj.vec")))

;; CRAFT subset: dependency parses, article text, and the Knowtator project file.
(def craft-dir
  (io/file home-dir "craft-versions" "concepts+assertions_1_article"))
(def dependency-dir
  (io/file craft-dir "Structures"))
(def references-dir
  (io/file craft-dir "Articles"))
(def annotations-file
  (io/file craft-dir "concepts+assertions.knowtator"))

;; Only the first article found in the directory is used here.
(def articles
  [(first (rdr/article-names-in-dir references-dir "txt"))])

#'beaker_clojure_shell_4090f0c3-a30e-4ec9-8e67-83060a9619fd/articles

In [13]:
(def references (rdr/read-references articles references-dir))
(def annotations (k/model annotations-file nil))
(.load annotations)
(def dependency (rdr/read-dependency word2vec-db articles references dependency-dir))
(def sentences (rdr/read-sentences annotations dependency articles))
(println "Num sentences:" (count sentences))

19-04-03 00:49:46 tuh8888-desktop DEBUG [edu.ucdenver.ccp.nlp.word2vec:50] - Word vector not found swmj
19-04-03 00:49:46 tuh8888-desktop DEBUG [edu.ucdenver.ccp.nlp.word2vec:50] - Word vector not found ey0
19-04-03 00:49:46 tuh8888-desktop DEBUG [edu.ucdenver.ccp.nlp.word2vec:50] - Word vector not found car0
19-04-03 00:49:46 tuh8888-desktop DEBUG [edu.ucdenver.ccp.nlp.word2vec:50] - Word vector not found mertts
19-04-03 00:49:46 tuh8888-desktop DEBUG [edu.ucdenver.ccp.nlp.word2vec:50] - Word vector not found c0bl
19-04-03 00:49:46 tuh8888-desktop DEBUG [edu.ucdenver.ccp.nlp.word2vec:50] - Word vector not found °
19-04-03 00:49:46 tuh8888-desktop DEBUG [edu.ucdenver.ccp.nlp.word2vec:50] - Word vector not found °
19-04-03 00:49:46 tuh8888-desktop DEBUG [edu.ucdenver.ccp.nlp.word2vec:50] - Word vector not found °
19-04-03 00:49:46 tuh8888-desktop DEBUG [edu.ucdenver.ccp.nlp.word2vec:50] - Word vector not found tctgaggatgttcacaggtttat

null

In [18]:
(let [property "has_location_in"
      seeds (set1/intersection
              (set (sentence/sentences-with-ann sentences "CRAFT_aggregate_ontology_Instance_21741"))
              (set (sentence/sentences-with-ann sentences "CRAFT_aggregate_ontology_Instance_21947")))
      seed-thresh 0.9
      context-thresh 0.9
      cluster-thresh 0.75
      min-support 20
      params {:seed             (first seeds)
              :seed-thresh      seed-thresh
              :context-thresh   context-thresh
              :seed-match-fn    #(and (concepts-match? %1 %2)
                                      (< seed-thresh (context-vector-cosine-sim %1 %2)))
              :context-match-fn #(< context-thresh (context-vector-cosine-sim %1 %2))
              :cluster-merge-fn add-to-pattern
              :cluster-match-fn #(let [score (context-vector-cosine-sim %1 %2)]
                                   (and (< (or %3 cluster-thresh) score)
                                        score))
              :min-support      min-support}]
  (->> (cluster-bootstrap-extract-relations seeds sentences params)
       (map #(merge % params))
       (map #(let [t (evaluation/matched-triples % annotations property)]
               (assoc % :num-matches (count t) :triples t)))
       (evaluation/format-matches)))
                 

19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:53] - Seeds 8
19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:54] - Patterns 0
19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:63] - New matches 7
19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:53] - Seeds 15
19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:54] - Patterns 0
19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:63] - New matches 7
19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:53] - Seeds 19
19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:54] - Patterns 0
19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:63] - New matches 4
19-04-03 00:57:01 tuh8888-desktop INFO [edu.ucdenver.ccp.nlp.relation-extraction:53] - Seeds 20
19-04-03 00:57:01 tu
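
The match functions in the cell above accept a pair only when the cosine similarity of their context vectors exceeds a threshold. As a rough sketch of that comparison, assuming plain Clojure vectors of numbers rather than Dep2Rel's actual data structures, it could look like the following. All function names here are hypothetical and deliberately differ from the `cosine-sim` and `context-vector-cosine-sim` functions used above.

```clojure
(defn dot-product
  "Sum of element-wise products of two equal-length numeric vectors."
  [a b]
  (reduce + (map * a b)))

(defn magnitude
  "Euclidean length of a numeric vector."
  [a]
  (Math/sqrt (dot-product a a)))

(defn cosine-similarity-sketch
  "Cosine of the angle between a and b; both must be non-zero vectors."
  [a b]
  (/ (dot-product a b)
     (* (magnitude a) (magnitude b))))

(defn similar-enough?
  "True when the similarity of a and b is strictly above `thresh`."
  [thresh a b]
  (< thresh (cosine-similarity-sketch a b)))

;; Vectors pointing in the same direction pass a 0.9 threshold;
;; orthogonal vectors do not.
(println (similar-enough? 0.9 [1.0 2.0] [2.0 4.0]))
(println (similar-enough? 0.9 [1.0 0.0] [0.0 1.0]))
```

Raising a threshold such as `seed-thresh` or `context-thresh` makes matching stricter, which would be expected to yield fewer but closer matches per bootstrapping round.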

In [None]:
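;; `matches` is not bound by the cells above; this assumes the result of the
;; extraction cell has been bound first, e.g. by wrapping it in (def matches ...).
(println "Final matches:" (count matches))
(println "Triples matched:" (count (distinct (mapcat :triples matches))))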

## Initialization cells

Run these cells before the cells above, as their execution counts indicate: they add the dependencies to the classpath, load the Dep2Rel source files, and require the namespaces used throughout the notebook.

In [4]:
%classpath config resolver mvnLocal
%classpath add mvn edu.ucdenver.ccp knowtator 2.1.6
%classpath add mvn org.clojure clojure 1.10.0
%classpath add mvn com.google.cloud google-cloud-bigquery 1.64.0
%classpath add mvn uncomplicate neanderthal 0.22.0
%classpath add mvn org.slf4j slf4j-simple 1.7.26
%classpath add mvn com.taoensso nippy 2.14.0
%classpath add mvn com.climate claypoole 1.1.4
%classpath add mvn com.taoensso timbre 4.10.0
%classpath add mvn org.clojure math.combinatorics 0.1.4
%classpath add mvn spicerack spicerack 0.1.6

In [17]:
(load-file "util.clj")
(load-file "edu/ucdenver/ccp/clustering.clj")
(load-file "edu/ucdenver/ccp/nlp/relation_extraction.clj")
(load-file "edu/ucdenver/ccp/knowtator_clj.clj")
(load-file "edu/ucdenver/ccp/conll.clj")
(load-file "edu/ucdenver/ccp/nlp/sentence.clj")
(load-file "edu/ucdenver/ccp/nlp/word2vec.clj")
(load-file "edu/ucdenver/ccp/nlp/readers.clj")
(load-file "edu/ucdenver/ccp/nlp/evaluation.clj")

#'edu.ucdenver.ccp.nlp.evaluation/parameter-walk

In [6]:
(require '[edu.ucdenver.ccp.nlp.relation-extraction :refer :all]
         '[clojure.java.io :as io]
         '[edu.ucdenver.ccp.nlp.readers :as rdr]
         '[edu.ucdenver.ccp.clustering :refer [single-pass-cluster]]
         '[edu.ucdenver.ccp.nlp.evaluation :as evaluation]
         '[edu.ucdenver.ccp.knowtator-clj :as k]
         '[util :refer [cosine-sim]]
         '[clojure.set :as set1]
         '[edu.ucdenver.ccp.nlp.sentence :as sentence])

null