# 1. Assessor and analyst work

## 1.0. Rating and criteria

Please [open this document](https://static.googleusercontent.com/media/guidelines.raterhub.com/en//searchqualityevaluatorguidelines.pdf)
and study chapters 13.0-13.4. Your task is to assess the organic results that different search engines return for the same query.

## 1.1. Explore the page

For the following search engines:
- https://duckduckgo.com/
- https://www.bing.com/
- https://ya.ru/
- https://www.google.com/

Perform the same query: "**How to get from Kazan to Voronezh**".

Discuss with your TA the following:
1. Which elements can you identify on the SERP? Ads, snippets, blends from other sources, ...?
2. Where are the organic results? How many of them are there?

## 1.2. Rate the results of the search engine

If your group is large, assess all search engines; otherwise choose 1 or 2. There should be at least 5 of you for each search engine. Use the scale from the handbook, with the numerical equivalents 0..4 for `[FailsM, SM, MM, HM, FullyM]`.

Compute:
- average relevance and standard deviation for each SERP element.
- [Fleiss kappa score](https://en.wikipedia.org/wiki/Fleiss%27_kappa#Worked_example) for your group. Use [this implementation](https://www.statsmodels.org/dev/generated/statsmodels.stats.inter_rater.fleiss_kappa.html).
- [Kendall rank coefficient](https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient) for some pairs in your group. Use [this implementation](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kendalltau.html).

Discuss the numerical results. Did you agree on the relevance? Did you agree on the rank? What is the difference?

In [4]:
import numpy as np

# example input by users
ranking_data = np.array([
	[4, 4, 4, 3, 4, 2, 2, 1, 1, 0],  # assessor 1 relevance
	[4, 3, 4, 3, 3, 2, 1, 1, 1, 1],  # assessor 2 relevance
	[3, 4, 4, 4, 4, 3, 2, 1, 1, 1],  # ...
	[4, 4, 4, 4, 3, 2, 2, 1, 1, 0],
	[4, 4, 4, 4, 3, 2, 2, 1, 1, 3]
])

Averages and standard deviations per item.

In [5]:
# TODO your code here
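
A minimal sketch of one way to fill in this cell, assuming the rows of `ranking_data` are assessors and the columns are SERP items, as in the example input. It uses the population standard deviation (`ddof=0`, NumPy's default); pass `ddof=1` if you prefer the sample estimate. The names `item_mean` and `item_std` are illustrative only.

```python
import numpy as np

# Same example input as above: one row per assessor, one column per SERP item.
ranking_data = np.array([
    [4, 4, 4, 3, 4, 2, 2, 1, 1, 0],
    [4, 3, 4, 3, 3, 2, 1, 1, 1, 1],
    [3, 4, 4, 4, 4, 3, 2, 1, 1, 1],
    [4, 4, 4, 4, 3, 2, 2, 1, 1, 0],
    [4, 4, 4, 4, 3, 2, 2, 1, 1, 3],
])

# axis=0 aggregates over assessors, leaving one value per item.
item_mean = ranking_data.mean(axis=0)
item_std = ranking_data.std(axis=0)  # population std (ddof=0)

for position, (m, s) in enumerate(zip(item_mean, item_std), start=1):
    print(f"item {position}: mean={m:.2f}, std={s:.2f}")
```

In this example the last item has the largest spread, which reflects the single outlying grade of 3 from the fifth assessor.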

Fleiss kappa score

In [6]:
!pip install statsmodels

In [13]:
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

aggregation, _ = aggregate_raters(ranking_data.T)
kappa = fleiss_kappa(aggregation)

aggregation, kappa

(array([[0, 0, 0, 1, 4],
        [0, 0, 0, 1, 4],
        [0, 0, 0, 0, 5],
        [0, 0, 0, 2, 3],
        [0, 0, 0, 3, 2],
        [0, 0, 4, 1, 0],
        [0, 1, 4, 0, 0],
        [0, 5, 0, 0, 0],
        [0, 5, 0, 0, 0],
        [2, 2, 0, 1, 0]]),
 0.5156081808396124)
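
To see where this number comes from, here is a from-scratch sketch of the Fleiss kappa computation with NumPy only. It assumes the five grades 0..4 are the categories and rebuilds the same item-by-category count table by hand; the variable names are illustrative. The result should agree with the `statsmodels` value above up to floating-point rounding.

```python
import numpy as np

ranking_data = np.array([
    [4, 4, 4, 3, 4, 2, 2, 1, 1, 0],
    [4, 3, 4, 3, 3, 2, 1, 1, 1, 1],
    [3, 4, 4, 4, 4, 3, 2, 1, 1, 1],
    [4, 4, 4, 4, 3, 2, 2, 1, 1, 0],
    [4, 4, 4, 4, 3, 2, 2, 1, 1, 3],
])

n_raters, n_items = ranking_data.shape
n_categories = 5  # grades 0..4

# counts[i, c] = number of assessors who gave grade c to item i
counts = np.array([
    np.bincount(ranking_data[:, i], minlength=n_categories)
    for i in range(n_items)
])

# Observed agreement: per item, the share of assessor pairs with equal grades.
p_item = (np.sum(counts ** 2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
p_observed = p_item.mean()

# Expected agreement by chance, from the overall distribution of grades.
p_category = counts.sum(axis=0) / (n_items * n_raters)
p_expected = np.sum(p_category ** 2)

kappa_manual = (p_observed - p_expected) / (1 - p_expected)
print(kappa_manual)
```

Note that kappa treats the grades as unordered labels: a disagreement between 3 and 4 is penalized exactly as much as one between 0 and 4.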

The Kendall tau score is pairwise: it compares the ranking given by one assessor with the ranking given by another.

In [33]:
from scipy.stats import kendalltau

kendalltau(
	ranking_data[0],
	ranking_data[1],
)

SignificanceResult(statistic=0.8336550215650926, pvalue=0.0031006074932690315)
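
As an illustration of what the statistic measures, below is a pure-Python sketch of the tau-b variant, which is the default in `scipy.stats.kendalltau`. It counts concordant and discordant item pairs and corrects for ties. The function name `kendall_tau_b` is hypothetical, and the sketch does not compute a p-value or handle the degenerate case where one assessor gives the same grade to every item.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b for two equally long sequences of grades."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied for both assessors: ignored entirely
        elif dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1  # both assessors order the pair the same way
        else:
            discordant += 1  # the assessors order the pair in opposite ways
    untied = concordant + discordant
    return (concordant - discordant) / sqrt(
        (untied + only_x_tied) * (untied + only_y_tied)
    )


assessor_1 = [4, 4, 4, 3, 4, 2, 2, 1, 1, 0]
assessor_2 = [4, 3, 4, 3, 3, 2, 1, 1, 1, 1]
tau = kendall_tau_b(assessor_1, assessor_2)
print(tau)
```

Unlike Fleiss kappa, tau only looks at the relative order of items, so two assessors who never give the same grade can still reach a high tau if they order the items the same way.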

# 2. Engineer work

You will create a bucket of URLs that are relevant to the query **"free cloud git"**. Then you will automate the search procedure using https://serpapi.com/, https://developers.google.com/custom-search/v1/overview, or any similar tool.

Then you will compute MRR@10 and Precision@10.
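
For reference, these are the usual definitions of both metrics over a set of queries $Q$. The notation is chosen here for illustration: $\mathrm{rank}_q$ is the 1-based position of the first relevant result for query $q$ within the top $k$, and $\mathrm{rel}_q(i)$ is 1 if the result at position $i$ is relevant and 0 otherwise.

$$
\mathrm{MRR@}k = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{\mathrm{rank}_q},
\qquad
\mathrm{Precision@}k = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{k} \sum_{i=1}^{k} \mathrm{rel}_q(i)
$$

A query with no relevant result in the top $k$ contributes 0 to the MRR sum.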

## 2.1. Build your bucket here

In [69]:
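import os

import requests

rel_bucket = [
	"github.com",
	"gitlab.com",
	"bitbucket.org"
]

# Read the API key from the environment instead of hard-coding it in the notebook.
# SERPAPI_KEY is an assumed variable name; set it before running this cell.
resp = requests.get(
	"https://serpapi.com/search.json",
	params={
		"q": "git free cloud",
		"location": "Austin, Texas, United States",
		"hl": "en",
		"gl": "us",
		"google_domain": "google.com",
		"key": os.environ["SERPAPI_KEY"],
	},
)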

resp.raise_for_status()
resp = resp.json()

organic = resp["organic_results"]
organic = [i["link"] for i in organic]
organic

['https://www.reddit.com/r/git/comments/46t07s/best_free_git_hosting/',
 'https://github.com/cloudcommunity/Free-Hosting',
 'https://github.com/cloudcommunity/Cloud-Free-Tier-Comparison',
 'https://www.gitpod.io/',
 'https://bitbucket.org/product',
 'https://opensource.com/article/18/8/github-alternatives',
 'https://www.quora.com/Whats-stopping-people-from-using-GitHub-for-free-unlimited-cloud-storage-instead-of-Dropbox',
 'https://git-scm.com/',
 'https://azure.microsoft.com/en-us/products/devops/repos',
 'https://source.cloud.google.com/']

## 2.2. Relevance assessment

Write the code to check whether an obtained document is relevant (True) or not (False).

In [89]:
from urllib.parse import urlparse


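def is_rel(resp_url):
	"""Return True if the host of resp_url is in rel_bucket."""
	netloc = urlparse(resp_url).netloc.lower()
	# Treat "www.github.com" and "github.com" as the same host.
	if netloc.startswith("www."):
		netloc = netloc[len("www."):]
	return netloc in rel_bucket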

## 2.3. Automation

Get the search results from your automation tool and assess each of them for relevance.

In [90]:
rels = [is_rel(link) for link in organic]
rels

[False, True, True, False, True, False, False, False, False, False]

## 2.4. MRR

Compute MRR:

In [91]:
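def mrr(mat):
	"""Mean reciprocal rank over the rows (queries) of a boolean relevance matrix."""
	rr = np.zeros(len(mat))

	for j, row in enumerate(mat):
		for i, x in enumerate(row):
			if x:
				# Reciprocal of the 1-based position of the first relevant result.
				rr[j] = 1 / (i + 1)
				break

	# Queries without any relevant result keep a reciprocal rank of 0,
	# which also avoids dividing by zero.
	return np.mean(rr)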

In [92]:
rels = np.array([rels])
rels

array([[False,  True,  True, False,  True, False, False, False, False,
        False]])

In [93]:
mrr(rels)

0.5

## 2.5. Precision

Compute mean precision:

In [94]:
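def mp(mat):
	"""Mean precision over the rows (queries) of a boolean relevance matrix."""
	# Precision of one query is the share of relevant results in its row.
	per_query = np.mean(mat, axis=1)
	return np.mean(per_query)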

In [95]:
mp(rels)

0.3
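
The functions above score whatever rows they are given. As a sketch of how the cutoff in MRR@10 and Precision@10 could be made explicit, the vectorized variants below truncate each row to the top `k` results first. The names `mrr_at_k` and `precision_at_k` and the second, all-irrelevant query are hypothetical and serve only to show the multi-query case.

```python
import numpy as np


def mrr_at_k(mat, k=10):
    """Mean reciprocal rank, looking only at the top-k results of each query."""
    top = np.asarray(mat, dtype=bool)[:, :k]
    first = top.argmax(axis=1)   # index of the first True (0 if there is none)
    has_rel = top.any(axis=1)    # separates "first hit at 0" from "no hit"
    rr = np.where(has_rel, 1.0 / (first + 1), 0.0)
    return rr.mean()


def precision_at_k(mat, k=10):
    """Mean share of relevant results among the top-k of each query."""
    top = np.asarray(mat, dtype=bool)[:, :k]
    # Divide by k, so missing results count as irrelevant.
    return (top.sum(axis=1) / k).mean()


# Relevance flags obtained above for the single query.
rels = np.array([[False, True, True, False, True,
                  False, False, False, False, False]])
print(mrr_at_k(rels), precision_at_k(rels))

# Hypothetical second query with no relevant results in the top 10.
two_queries = np.vstack([rels, np.zeros((1, 10), dtype=bool)])
print(mrr_at_k(two_queries), precision_at_k(two_queries))
```

With more than one query, both metrics are plain averages of the per-query values, so a query with no relevant results pulls both scores down.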