> # First year Computer Science project - Thomas Aujoux - Grégoire Brugère
    
\
\
**Welcome on the application : Find My Friends !!**     
\
The IP-Paris campus brings together many students with very different interests and tastes. At the beginning of the year, it's very difficult to meet people who enjoy the same activities as you. With the help of the BDE, we decided to create an application that will enable us to create groups of friends with the same passions for the start of the 1A year next year. This application is called **Find My Friends**.

\
All you have to do is **rate your activities from 0 to 10** (0 for an activity you hate and 10 for an activity you love). The application will send you directly the people with whom **you are most likely to enjoy the same activities**!

\
This project will be divided into two parts:


* The first part involves creating the matching algorithm.
* The second part, which is used to create the graphical interface to simplify the user experience.

\
Note: unit tests come just after the code they test.

---



> # First part : matching algorithm



# I) Creation of the "fake users" database

\
Firstly, we are going to create a database of different users. Each column will represent a centre of interest and each row a user. 

Given that we will then reduce the size of the database (APC), we can be completely exhaustive in terms of the amount of data!

As this year's theme is not scrapping (finding information on the internet), we're going to create the user profiles ourselves. To do this, we're going to proceed in two stages

Creating the database:

* We will set up categories ("Writing", "Suchi", "Hockey", "Theatre", "Sport at home" ...) filled in from 0 to 10, the "fake users" will put a note from 0 to 10 for 0 an activity that we do not like and 10 an activity that we love to do.

* Then, there will be "Bios" that will allow "fake users" to add passions that are not present in the list. 



# I) A) Setting up categories


We create the categories:



* We import the libraries used, and define the number of "false individuals" that will be present in our database.

* We define the categories that will be the columns of our database. As we will be reducing the dimensions later, we can define as many as we like! The categories chosen are taken from the most common passions of the French.

* We fill in the different columns with random numbers from 0 to 10, 0 for an individual who doesn't like this activity at all, 10 for an individual who loves this activity.

* Then, to carry out a PCA, we need to put the data on the same scale.

* Finally, we'll run a unit test to see if all the data in the table are between 0 and 1.


In [2]:
import pandas as pd
import numpy as np

n=10000
#categories = ["Informatique","Sport","Cinéma","mode", "Ecriture", "Suchi", "Hockey", "Théâtre","Sport à la maison","hackaton", "Mangas", "Sneakers", "Maquillage", "Instagram", "Arts martiaux", "Marvel", "Marche à pied", "Course à pied", "voyage", "Discussion dans une autre langue", "Réseaux sociaux", "Cosmétique", "Skateboard", "Cuisine végane", "Photographie", "Lecture", "Chant", "Volleyball", "Sports", "Vinyasa", "Café","League of Legends","Karaoké","Fortnite","Plongée en apnée","Itinéraire gourmand","Statistiques","Mathématiques","Natation","Tennis","Rugby","escalade","bowling","course"]
#We have opted for a list of simplified categories, making the matching less specific but making 
#the programme simpler and more effective on a smaller scale. 
categories = ["Computer Science","Sport","Cinema","Fashion","Litterature", "Gastronomie", "Hockey", "Theater", "Hackaton"]
len(categories)

9

We have imported the necessary modules. We've also defined the number of individuals we want to create and the categories we want our application to include.

In [3]:
def remplir(categorie,nombre_indiv):
  dat = pd.DataFrame(columns = categorie)
  for i in dat.columns:
    dat[i] = np.random.randint(0,10,nombre_indiv)
  return dat

data = remplir(categories,n)
data



Unnamed: 0,Computer Science,Sport,Cinema,Fashion,Litterature,Gastronomie,Hockey,Theater,Hackaton
0,4,1,1,9,3,5,9,7,8
1,5,7,8,9,5,5,8,5,2
2,1,8,7,9,2,2,2,3,4
3,2,5,5,7,1,8,5,0,4
4,9,4,4,8,9,9,9,8,6
...,...,...,...,...,...,...,...,...,...
9995,1,6,2,1,3,7,5,9,1
9996,7,4,2,1,0,5,7,6,9
9997,6,6,8,6,5,5,6,5,9
9998,7,3,2,4,0,1,7,6,3


We've created a function based on several categories: we currently have 42 categories and 1000 individuals who have given their opinions.

As we're going to be doing a clustering algorithm next, with so much data it can take a long time to do. Scaling the data will reduce the execution time. To do this, we're going to use a classic sklearn command.

In [4]:
from sklearn.preprocessing import MinMaxScaler

def echelle_fn(dat, categorie):
    dat = pd.DataFrame(MinMaxScaler().fit_transform(dat))
    dat = dat.set_axis(categorie, axis='columns')
    return dat

echelle1=echelle_fn(data,categories)
echelle1

Unnamed: 0,Computer Science,Sport,Cinema,Fashion,Litterature,Gastronomie,Hockey,Theater,Hackaton
0,0.444444,0.111111,0.111111,1.000000,0.333333,0.555556,1.000000,0.777778,0.888889
1,0.555556,0.777778,0.888889,1.000000,0.555556,0.555556,0.888889,0.555556,0.222222
2,0.111111,0.888889,0.777778,1.000000,0.222222,0.222222,0.222222,0.333333,0.444444
3,0.222222,0.555556,0.555556,0.777778,0.111111,0.888889,0.555556,0.000000,0.444444
4,1.000000,0.444444,0.444444,0.888889,1.000000,1.000000,1.000000,0.888889,0.666667
...,...,...,...,...,...,...,...,...,...
9995,0.111111,0.666667,0.222222,0.111111,0.333333,0.777778,0.555556,1.000000,0.111111
9996,0.777778,0.444444,0.222222,0.111111,0.000000,0.555556,0.777778,0.666667,1.000000
9997,0.666667,0.666667,0.888889,0.666667,0.555556,0.555556,0.666667,0.555556,1.000000
9998,0.777778,0.333333,0.222222,0.444444,0.000000,0.111111,0.777778,0.666667,0.333333




---

# Unit test 
The aim of the unit test is to check whether the data in the table is between 0 and 1. This will check whether our data scaling has worked.
As the number of operations is excessively large, we're going to look at the first 100 rows. Given that the variables are generated randomly, this poses no problem.



In [7]:

def test1_fn(echell):
  test = True 
  for i in range(100):
    for j in echell.columns:
      if echell.iloc[i][j]<0 or echell.iloc[i][j]>1:
        test = False
  return test

test1 = test1_fn(echelle1)
test1


True



---



# I) B) Creating biographies ("Bios")

\
We create the biographies:

Our aim will be to generate fake sentences to imitate real biographies created by users. The aim of these "Bios" is to give users a greater opportunity to express themselves by giving words that were not present in the categories. To do this, we will proceed in several stages:

* Each individual will be randomly assigned a word from a very large list of words (the list was found on the internet, these are sports beginning with the letter "A"). 
* To make the biographies more realistic, we're going to add "fake activities" ('the', 'la', 'les', 'je', 'j','') to the list of words to give the impression that they are real sentences. In reality, it would have been necessary to process the bios to make them more readable, as the word 'the' gives no information about the preference of an activity.
* We will first process the list of biographies by removing all capital letters and transforming them into lower case, then we will replace certain letters by others ("é" into "e", "è" into "e"...). This will simplify the biographies.

* The second simplification of the biographies will consist of separating all the words with spaces between them, to allow categories to appear. Next, we'll remove all the words we consider unnecessary, such as ('the', 'the', 'the', 'I', 'I','').

* As the words are random, it is possible that the only word present for a user is not usable. We will therefore assign random words to these users.

* Finally, we will format the activities in our database.

In [5]:
import random
L = ['le', 'la', 'les', 'je', 'j','',"Brunch","Fripes","Voguing","Couchsurfing","Mèmes","Happy-Hour","Moto","Investissement","Art","Randonnée","Montagnes","Backpachking","Pêche","Tennis","Glace","Pétanque","Patinoire","Expositions","Ski","Snowboard","Pilates","Broadway","Cheerleading","Chorale","Street-food","Accrobranche","Acrosport","Aerobic","Aéromodélisme","Aérostation","Agility","Aikido","Airsoft","Alpinisme","Apnée","Aquabike","Aquagym","Athlétisme","Aviation","Aviron"]

In [6]:
def biographie(L_,nombre_indiv):
  Bios = []
  for i in range(nombre_indiv):
    Bios.append(random.choice(L_))
  return Bios

Bios = biographie(L,n)
for i in range(0,15):
  print(Bios[i])

Aviation
les
Fripes
Aérostation
le
Apnée
Accrobranche
Aérostation
Randonnée
Randonnée
Aquabike
Aéromodélisme
la
Aerobic
le


"Bios" is the list of user biographies. It contains a large number of words taken from a list containing other "categories" that users could have entered themselves. In addition to these categories, there are words in the French language that will have to be removed.

In [9]:
import string as str

mot_inutile = ['I', 'a', 'an', 'with', 'which', '', 'the']
liste_remplacement = [['.', ' '], ["'", ' '], ['é', 'e'], ['è', 'e'], ['ê', 'e'], [':', '']]

We create a list of words that we are going to remove from the biographies because they do not express an opinion and some words are replaced by others to simplify processing afterwards.

In [10]:
def simplification(L_,nombre_indiv,liste_remplace):
  texte_simple=""
  texte_brut = biographie(L_,nombre_indiv) 
  for j in texte_brut:
    texte_simple = texte_simple+ " " + j.lower()
  for i in liste_remplace:
    texte_simple = texte_simple.replace(i[0], i[1])
  return(texte_simple)

simplification(L,n,liste_remplacement)

' couchsurfing peche agility apnee voguing investissement moto aviation expositions agility street-food les tennis pilates aerobic chorale  backpachking snowboard street-food j cheerleading accrobranche alpinisme chorale brunch art les la aviation aviation aquagym aeromodelisme broadway peche aviation couchsurfing backpachking aeromodelisme aviation aviation happy-hour patinoire happy-hour peche brunch montagnes petanque aquagym le art aviation agility apnee j aquabike brunch memes accrobranche agility aquagym art petanque agility aerostation apnee memes randonnee aquagym happy-hour happy-hour pilates cheerleading aerostation peche patinoire je investissement peche happy-hour j petanque aviron la j investissement memes le patinoire glace ski la aerobic backpachking aviation expositions randonnee  couchsurfing voguing peche ski  aerobic moto couchsurfing snowboard airsoft petanque tennis petanque expositions couchsurfing aviation aerobic brunch cheerleading montagnes peche memes art je 

The aim of this simplification of the biography is to remove the capital letters and replace certain words with others (the words are given in the replacement list).

In [11]:
def simplification2(L_,nombre_indiv, liste_remplace,mot_inutil):
    ltexte = (simplification(L_,nombre_indiv, liste_remplace)).split(' ')
    texte_toke = []
    for mot in ltexte:
        if mot not in mot_inutil:
            texte_toke.append(mot)
    return(texte_toke)

In [12]:
liste_mots = simplification2(L,n, liste_remplacement, mot_inutile)

for i in range(0,15):
  print(liste_mots[i])

expositions
aikido
chorale
peche
acrosport
moto
happy-hour
aviron
montagnes
petanque
moto
street-food
happy-hour
memes
broadway


Obviously, given that the biographies are generated using random words, we are not immune to the situation where one of the biographies is generated using only unnecessary words ('le', 'la', 'les', 'je', 'j',''), which we deleted earlier. The biography must therefore be filled in with categories.

In [13]:
I = ["Brunch","Fripes","Voguing","Couchsurfing","Mèmes","Happy-Hour","Moto","Investissement","Art","Randonnée","Montagnes","Backpachking","Pêche","Tennis","Glace","Pétanque","Patinoire","Expositions","Ski","Snowboard","Pilates","Broadway","Cheerleading","Chorale","Street-food","Accrobranche","Acrosport","Aerobic","Aéromodélisme","Aérostation","Agility","Aikido","Airsoft","Alpinisme","Apnée","Aquabike","Aquagym","Athlétisme","Aviation","Aviron"]

def biographie_finale(echell,L_,I_,nombre_indiv,liste_remplace,mot_inutil):
  Bios = biographie(L_,nombre_indiv)
  liste_mot = simplification2(L_,nombre_indiv,liste_remplace, mot_inutil)
  for i in range((nombre_indiv-len(liste_mot))):
    liste_mot.append(random.choice(I_))
  echell = echell.assign(Bios= liste_mot)
  return echell

data_nouv = biographie_finale(echelle1,L,I,n,liste_remplacement,mot_inutile)
data_nouv["Bios"]
data_nouv

Unnamed: 0,Computer Science,Sport,Cinema,Fashion,Litterature,Gastronomie,Hockey,Theater,Hackaton,Bios
0,0.444444,0.111111,0.111111,1.000000,0.333333,0.555556,1.000000,0.777778,0.888889,art
1,0.555556,0.777778,0.888889,1.000000,0.555556,0.555556,0.888889,0.555556,0.222222,aerobic
2,0.111111,0.888889,0.777778,1.000000,0.222222,0.222222,0.222222,0.333333,0.444444,agility
3,0.222222,0.555556,0.555556,0.777778,0.111111,0.888889,0.555556,0.000000,0.444444,aviation
4,1.000000,0.444444,0.444444,0.888889,1.000000,1.000000,1.000000,0.888889,0.666667,moto
...,...,...,...,...,...,...,...,...,...,...
9995,0.111111,0.666667,0.222222,0.111111,0.333333,0.777778,0.555556,1.000000,0.111111,Moto
9996,0.777778,0.444444,0.222222,0.111111,0.000000,0.555556,0.777778,0.666667,1.000000,Patinoire
9997,0.666667,0.666667,0.888889,0.666667,0.555556,0.555556,0.666667,0.555556,1.000000,Aquabike
9998,0.777778,0.333333,0.222222,0.444444,0.000000,0.111111,0.777778,0.666667,0.333333,Street-food


Before processing the database, you need to create new categories with biographies and complete the database for each user. This is called vectorisation!

In [14]:
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer()

def bio_vectoriser(echell,L_,I_,nombre_indiv):
  data_nouv = biographie_finale(echell,L_,I_,nombre_indiv,liste_remplacement,mot_inutile)
  vectorizer = CountVectorizer()
  tableau = vectorizer.fit_transform(data_nouv["Bios"])
  nouvelles_bios = pd.DataFrame(tableau.toarray(), columns=vectorizer.get_feature_names_out())
  df = pd.concat([data_nouv, nouvelles_bios], axis=1)
  df.drop('Bios', axis=1, inplace=True)
  return df

df = bio_vectoriser(echelle1,L,I,n)
df_copie = df
df


Unnamed: 0,Computer Science,Sport,Cinema,Fashion,Litterature,Gastronomie,Hockey,Theater,Hackaton,accrobranche,...,pilates,pétanque,pêche,randonnee,randonnée,ski,snowboard,street,tennis,voguing
0,0.444444,0.111111,0.111111,1.000000,0.333333,0.555556,1.000000,0.777778,0.888889,0,...,0,0,0,0,0,0,0,0,0,0
1,0.555556,0.777778,0.888889,1.000000,0.555556,0.555556,0.888889,0.555556,0.222222,0,...,0,0,0,0,0,0,0,0,0,0
2,0.111111,0.888889,0.777778,1.000000,0.222222,0.222222,0.222222,0.333333,0.444444,0,...,0,0,0,0,0,0,0,0,0,0
3,0.222222,0.555556,0.555556,0.777778,0.111111,0.888889,0.555556,0.000000,0.444444,0,...,0,0,0,0,0,0,0,0,0,0
4,1.000000,0.444444,0.444444,0.888889,1.000000,1.000000,1.000000,0.888889,0.666667,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9995,0.111111,0.666667,0.222222,0.111111,0.333333,0.777778,0.555556,1.000000,0.111111,0,...,0,0,0,0,0,0,0,0,0,0
9996,0.777778,0.444444,0.222222,0.111111,0.000000,0.555556,0.777778,0.666667,1.000000,0,...,0,0,0,0,0,0,0,0,0,0
9997,0.666667,0.666667,0.888889,0.666667,0.555556,0.555556,0.666667,0.555556,1.000000,0,...,0,0,0,0,0,0,0,0,0,0
9998,0.777778,0.333333,0.222222,0.444444,0.000000,0.111111,0.777778,0.666667,0.333333,0,...,0,0,0,0,1,0,0,0,0,0


# II) Dimension reduction

\
**General idea:**

We now have a very large number of columns. We need to reduce the size of our database to make the algorithm more efficient.
This method will make it possible to retain a large proportion of the statistical information while considerably reducing the time taken by the algorithm.

The principle is simple: it involves summarising the information contained in a large database into a number of synthetic variables called Principal Components. 

The idea is then to be able to project these data onto the nearest hyperplane in order to have a simple representation of our data.
Obviously, when you reduce dimensions, you lose information. That's what Principal Component Analysis is all about. We have chosen to keep 90% of the information.

\
**The various stages:**


* First, we display the variance explained as a function of the number of eigenvalues used. We note that often, once a certain number of eigenvalues has been reached, adding another eigenvalue does not add much information.

* We will perform a PCA based on the percentage of "variance explained" found in the previous step.

In [16]:
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

This graph allows us to understand how many eigenvectors explain a proportion of the variance, and therefore of the information. With a certain number of "Components" it is possible to explain a certain percentage of the variance.

In [22]:
def pca_fn(df,n):
  pca=PCA(n)
  df_pca = pca.fit_transform (df)
  pca = PCA(n_components=pca.n_components_)
  df_pca = pca.fit_transform(df)
  return df_pca

df_pca = pca_fn(df,0.97)
#df_pca_copy=df_pca
df_pc = pca_fn(df,0.97)

PCA enabled us to reduce the number of principal components or functionalities. Given that our project is based on a randomly generated database, the results are not always the same. On the other hand, on the first graph we can see that the curve is less and less steep and that from 0.97 variance explained adding a category doesn't add much information compared to the categories before it. We therefore decided to carry out our PCA keeping 0.97% variance explained. 
This allows us to remove around 15 categories (depending on whether the database is randomly generated). 

# III) Cluster creation


**General idea:**

Now that our database is ready, we can create clusters. We have chosen to use a hierarchical clustering algorithm because we know exactly how many clusters we want, given that we want groups of 5 people.

**The different stages:**

The principle of the hierarchical method is very simple, it forms step-by-step connections between individuals and uses a distance matrix to find the cluster closest to another. 

* First step: for each individual, find the other individual who is closest and form groups of 2 from the population.

* Second step: the distance matrix needs to be updated by removing one cell, due to the grouping of two individuals. Then, depending on the aggregation strategy, we identify the group closest to the first and then merge them. 

* The algorithm is repeated until the right number of clusters is obtained.

In [23]:
from sklearn.cluster import AgglomerativeClustering

def clusteriser(dat,df_pc,nombre_cluste,L_,I_,nombre_indiv):

  cluster = AgglomerativeClustering(n_clusters=nombre_cluste)  
  cluster.fit(df_pc)
  numeros_cluster = cluster.labels_
  biographie_fi = biographie_finale(dat,L_,I_,nombre_indiv,liste_remplacement,mot_inutile)

  biographie_fi['Cluster #'] = numeros_cluster
  return biographie_fi

nombre_cluster = int(n/10)
data_final = clusteriser(data,df_pc, nombre_cluster,L,I,n)
data_final

#biographie_finale(echell,L_,I_,nombre_indiv,liste_remplace,mot_inutil)

Unnamed: 0,Computer Science,Sport,Cinema,Fashion,Litterature,Gastronomie,Hockey,Theater,Hackaton,Bios,Cluster #
0,4,1,1,9,3,5,9,7,8,voguing,9
1,5,7,8,9,5,5,8,5,2,cheerleading,845
2,1,8,7,9,2,2,2,3,4,les,555
3,2,5,5,7,1,8,5,0,4,aquabike,139
4,9,4,4,8,9,9,9,8,6,chorale,146
...,...,...,...,...,...,...,...,...,...,...,...
9995,1,6,2,1,3,7,5,9,1,Pétanque,193
9996,7,4,2,1,0,5,7,6,9,Chorale,526
9997,6,6,8,6,5,5,6,5,9,Snowboard,304
9998,7,3,2,4,0,1,7,6,3,Fripes,436


In [24]:
def trouver_cluster(data_fi,num):
  df_mask=data_fi['Cluster #'] == num
  filtered_df = data_fi[df_mask]
  return(filtered_df)

pd.DataFrame(trouver_cluster(data_final,50))



Unnamed: 0,Computer Science,Sport,Cinema,Fashion,Litterature,Gastronomie,Hockey,Theater,Hackaton,Bios,Cluster #
587,1,1,3,3,5,5,2,1,4,memes,50
1951,1,3,8,7,7,3,3,2,3,airsoft,50
2472,2,1,4,4,7,4,0,2,5,glace,50
2970,8,1,5,9,9,2,1,0,5,glace,50
3022,1,3,4,8,9,4,0,4,0,aerostation,50
3352,1,4,8,9,9,5,3,1,3,aikido,50
3741,3,0,5,6,8,2,1,0,1,voguing,50
3978,0,1,8,4,9,9,2,3,5,voguing,50
4409,1,1,9,5,9,6,4,5,4,athletisme,50
6671,6,0,6,4,9,1,0,1,3,glace,50




---
# Unit test

The aim of this test is to check whether two individuals with the same characteristics will be assigned to the same cluster. We will therefore create a copy of an individual and put it back into the database. The test will be verified if the two individuals are in the same cluster.


In [25]:
def creer_cluster(categorie,nombre_indiv,indiv,L_,I_,nombre_cluste):
  dat = remplir(categorie,nombre_indiv)
  #nouv_ligne = pd.DataFrame([individu],columns = categorie)
  #dat = pd.DataFrame(dat.concat([dat, nouv_ligne], ignore_index=True))
  dat.loc[nombre_indiv-1]= indiv
  z= echelle_fn(dat,categorie)
  df = bio_vectoriser(z,L_,I_,nombre_indiv)
  df_pca2 = pca_fn(df,0.97)
  data_fin = clusteriser(dat,df_pca2,nombre_cluste,L_,I_,nombre_indiv)
  print(data_fin)
  numero = data_fin["Cluster #"].loc[nombre_indiv-1]
  print(numero)
  return trouver_cluster(data_fin,numero)

In [26]:
a =creer_cluster(categories,n,[5]*9,L,I,nombre_cluster)
a

      Computer Science  Sport  Cinema  Fashion  Litterature  Gastronomie  \
0                    2      5       1        5            9            7   
1                    7      2       3        9            4            7   
2                    5      8       5        5            5            7   
3                    9      0       9        6            5            3   
4                    2      3       0        7            4            4   
...                ...    ...     ...      ...          ...          ...   
9995                 7      8       1        1            6            9   
9996                 0      8       0        8            4            5   
9997                 2      9       9        5            7            7   
9998                 7      1       9        0            2            1   
9999                 5      5       5        5            5            5   

      Hockey  Theater  Hackaton         Bios  Cluster #  
0          2        0        

Unnamed: 0,Computer Science,Sport,Cinema,Fashion,Litterature,Gastronomie,Hockey,Theater,Hackaton,Bios,Cluster #
995,8,6,7,5,5,4,4,3,7,street-food,394
1004,5,4,9,0,2,1,7,2,5,cheerleading,394
1552,8,6,8,5,4,1,6,5,7,patinoire,394
1607,0,8,4,0,3,6,2,0,9,les,394
1663,4,7,0,0,3,3,5,2,6,ski,394
3826,4,3,9,3,0,3,6,4,9,snowboard,394
4323,6,6,8,3,2,2,7,0,8,peche,394
4407,1,4,3,0,0,1,7,1,6,aikido,394
4985,6,4,4,1,4,2,6,0,9,ski,394
5751,0,6,7,0,5,1,8,2,8,ski,394




---


> # Part two: Creating the interface

# I)



In [None]:
from tkinter import *
from tkinter import messagebox as mb


class Quiz:
	def __init__(self):
		self.qno=0 #Cette acumulateur indique quelle est le numér de la question traitée
		self.disp_title() #Fonction qui va permettre d'afficher le titre de test
		self.disp_ques() #Fonction qui va permettre d'afficher les questions
		self.opt_sel=IntVar() #Variable indiquant quelle réponse a été selectionné
		self.opts=self.radio_buttons() #Cela concerne l'affichage des réponses possibles, qui seront faites sous la forme de "radiobuttons"
		self.disp_opt() #Fonction permettant d'afficher les réponses possibles
		self.buttons() #Cette fonction sert à afficher les boutons de validation des questions et d'arrêt du test
		self.total_size=len(question)
		self.recup= [] #Cette liste servira à récupérer les réponses entrées par l'utilisateur

	def disp_res(self):
		
		result = f"Results: {creer_cluster(categories,n,self.recup,L,I,nombre_cluster)}" #à noter qu'afin de rendre le test plus court les autres questions ont été répondu à 0.5
		
		mb.showinfo("Results :", f"{result}")
        

	def next_btn(self):
		(self.recup).append(self.opt_sel.get() -1) #On insére la valeur entré par l'utilisateur dans la liste recup
		self.qno += 1
		
		if self.qno==self.total_size: #Cette condition implique que toutes les questions ont été répondu, nous pouvons passé au résultat
			self.disp_res()
			ws.destroy()
		else:
			self.disp_ques()
			self.disp_opt()


	def buttons(self):
		
		next_button = Button(
            ws, 
            text="Next",
            command=self.next_btn,
            width=10,
            bg="#F2780C",
            fg="white",
            font=("ariel",16,"bold")
            )
		
		next_button.place(x=350,y=380)
		
		quit_button = Button(
            ws, 
            text="Quit", 
            command=ws.destroy,
            width=5,
            bg="black", 
            fg="white",
            font=("ariel",16," bold")
            )
		
		quit_button.place(x=700,y=50)   #Cette fonction définit globalement le design des boutons "QUIT" et "Suivant" et leurs commandes


	def disp_opt(self):
		val=0
		self.opt_sel.set(0)
		
		for option in options[self.qno]:
			self.opts[val]['text']=option #On parcourt la liste des options afin de toutes les affichées
			val+=1

	def disp_ques(self):
		
		qno = Label(
            ws, 
            text=question[self.qno], 
            width=60,
            font=( 'ariel' ,16, 'bold' ), 
            anchor= 'w',
			wraplength=700,
			justify='center'
            )
		
		qno.place(x=70, y=100)


	def disp_title(self):
		
		title = Label(
            ws, 
            text="Compatibility Test",
            width=50, 
            bg="#F2A30F",
            fg="white", 
            font=("ariel", 20, "bold")
            )
		
		title.place(x=0, y=2)


	def radio_buttons(self):
		
		q_list = []
		
		x_pos = 130
		
		while len(q_list) < 11: #Cette boucle while permet d'afficher les 11 boutons nécessaire à noter les catégories
			
			radio_btn = Radiobutton(
                ws,
                text=" ",
                variable=self.opt_sel,
                value = len(q_list)+1,
                font = ("ariel",18)
                )
			q_list.append(radio_btn)
			
			radio_btn.place(x = x_pos, y = 200)
			
			x_pos += 50
		
		return q_list

ws = Tk()

ws.geometry("800x450")

ws.title("Compatibility Test")
question = ["Your interest in computer science?", "Your interest in sport?", "Your interest in film?", "Your interest in fashion?", "Your interest in literature?", "Your interest in Japanese cuisine?", "Your interest in hockey?", "Your interest in theatre?", "Your interest in hackathons?"]
options = [['0','1','2','3','4','5','6','7','8','9','10']]*len(question)

quiz = Quiz()

ws.mainloop()