# Plotly Visualization Library


## Content:

1. Loading Data and Explanation of Features
1. Line Charts
1. Scatter Charts
1. Bar Charts
1. Pie Charts
1. Bubble Charts
1. Histogram
1. Word Cloud
1. Box Plot
1. Scatter Plot Matrix
1. Map Plots: https://www.kaggle.com/kanncaa1/time-series-prediction-with-eda-of-world-war-2
1. Data Visualization
     * Seaborn: https://www.kaggle.com/kanncaa1/seaborn-for-beginners
     * Bokeh 1: https://www.kaggle.com/kanncaa1/interactive-bokeh-tutorial-part-1
     * Bokeh 2: https://www.kaggle.com/kanncaa1/interactive-bokeh-tutorial-part-
     * Rare Visualization: https://www.kaggle.com/kanncaa1/rare-visualization-tools
1. Inset Plots
1. 3D Scatter Plot with Colorscaling
1. Multiple Subplots
1. Earthquake Animation: https://www.kaggle.com/kanncaa1/earthquake-animation-with-plotly

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt 
import seaborn as sns 
#plotly import 
from plotly.offline import init_notebook_mode, iplot, plot
import plotly as py
init_notebook_mode(connected=True)
import plotly.graph_objs as go

#wordcloud library 
from wordcloud import WordCloud

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

### 1-) Loading Data

> <font color ="blue">
* timesData includes 14 features that are:
    <font color ="green">
    * world_rank
    * university_name
    * country
    * teaching
    * international
    * research
    * citations
    * income
    * total_score
    * num_students
    * student_staff_ratio
    * international_students
    * female_male_ratio
    * year

In [None]:
# Read the data
timesdata=pd.read_csv("/kaggle/input/world-university-rankings/timesData.csv")

In [None]:
# Inspect the features in the data
timesdata.info()

In [None]:
# Take a quick look at the data
timesdata.head()

### 2-) Line Charts 
<font color ="red">
In Plotly, charts follow a fixed pattern: build the pieces below, then draw the figure.
<font color ="black">
* Import graph_objs as go
* Creating traces
    * x = x axis
    * y = y axis
    * mode = type of plot: markers, lines, or lines+markers
    * name = name of the trace (shown in the legend)
    * marker = marker properties, given as a dictionary
        * color = color of the line or markers. It takes RGB (red, green, blue) and opacity (alpha)
    * text = the hover text (shown when the cursor is over a point)
* data = a list that holds the traces
* layout = a dictionary
    * title = title of the layout
    * xaxis = a dictionary
        * title = label of the x axis
        * ticklen = length of the x axis ticks
        * zeroline = whether to show the zero line
* fig = combines data and layout
* iplot() = plots the figure (fig) built from data and layout
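
The nesting in the list above can be sketched with plain dictionaries, which is the same `dict(data=..., layout=...)` shape this notebook passes to `iplot()`. The helper name `make_line_figure` and the sample values are made up for illustration; they are not part of Plotly or of the dataset.

```python
def make_line_figure(x, y, name, color, hover_text, title, x_title):
    """Build a Plotly-style figure as nested dicts (hypothetical helper)."""
    trace = {
        "type": "scatter",
        "x": list(x),
        "y": list(y),
        "mode": "lines",               # or "markers", "lines+markers"
        "name": name,
        "marker": {"color": color},    # color is nested inside marker
        "text": list(hover_text),      # shown on hover
    }
    layout = {
        "title": title,
        # axis options are nested inside the xaxis dictionary
        "xaxis": {"title": x_title, "ticklen": 5, "zeroline": False},
    }
    return {"data": [trace], "layout": layout}


fig = make_line_figure(
    x=[1, 2, 3],
    y=[99.9, 98.8, 97.5],              # made-up values for illustration
    name="citations",
    color="rgba(150, 18, 23, 0.6)",
    hover_text=["A", "B", "C"],
    title="Example",
    x_title="World Rank",
)
print(fig["data"][0]["marker"]["color"])
print(fig["layout"]["xaxis"])
```

In a notebook where Plotly is loaded, a dictionary of this shape could then be handed to `iplot(fig)`.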

In [None]:
# Compare citations and teaching scores for 2011
# Prepare the data first
df = timesdata.iloc[:100,:] # all columns, first 100 rows
# Import graph objects
import plotly.graph_objs as go 
# Create one trace for each line to be drawn
# Creating Trace 1 
trace1 = go.Scatter(
                    x=df.world_rank, # x axis: world rank
                    y=df.citations,  # y axis: citations score
                    mode ="lines", # draw with lines
                    name ="citations", # name of the trace
                    marker = dict(color ="rgba(150,18,23,0.6)"), # RGB = red, green, blue, each from 0 to 255; a = alpha (opacity) from 0 to 1, where 0 is fully transparent
                    text = df.university_name ) # hovering over the line shows the university name
trace2 = go.Scatter(
                    x=df.world_rank,
                    y=df.teaching,
                    mode ="lines+markers", # draw with lines and markers
                    name ="teaching",
                    marker =dict(color ="rgba(56,26,188,0.7)"),
                    text = df.university_name ) 
         
data = [trace1, trace2] # put both traces into a single list
layout = dict(title = "Citation and Teaching vs World Rank of Top 100 Universities",
xaxis = dict(title ="World Rank", ticklen = 5, zeroline = False))
fig =dict(data = data, layout = layout)
iplot(fig)
        
   



### 3-) Scatter Plot 
<font color ="red">
In Plotly, charts follow a fixed pattern: build the pieces below, then draw the figure.
<font color ="black">
* Import graph_objs as go
* Creating traces
    * x = x axis
    * y = y axis
    * mode = type of plot: markers, lines, or lines+markers
    * name = name of the trace (shown in the legend)
    * marker = marker properties, given as a dictionary
        * color = color of the markers. It takes RGB (red, green, blue) and opacity (alpha)
    * text = the hover text (shown when the cursor is over a point)
* data = a list that holds the traces
* layout = a dictionary
    * title = title of the layout
    * xaxis = a dictionary
        * title = label of the x axis
        * ticklen = length of the x axis ticks
        * zeroline = whether to show the zero line
* fig = combines data and layout
* iplot() = plots the figure (fig) built from data and layout
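
Since every trace in this notebook sets its color through an `rgba(...)` string, a small formatter can help keep the channels in range. This is only a sketch: `rgba` here is a hypothetical helper written for this example, not a Plotly function.

```python
def rgba(red, green, blue, alpha=1.0):
    """Format an 'rgba(r, g, b, a)' color string (hypothetical helper)."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("RGB channels must be between 0 and 255")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 (transparent) and 1 (opaque)")
    return f"rgba({red}, {green}, {blue}, {alpha})"


# One color per year, matching the values used for the scatter traces
colors = {
    "2014": rgba(255, 128, 255, 0.8),
    "2015": rgba(255, 128, 2, 0.8),
    "2016": rgba(0, 225, 185, 0.8),
}
print(colors["2014"])  # rgba(255, 128, 255, 0.8)
```

The resulting strings can be used wherever a trace expects `marker = dict(color = ...)`.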

In [None]:
# Compare citations for the years 2014, 2015 and 2016
# Prepare the data frames
df2014 = timesdata[timesdata.year == 2014].iloc[:100,:]
df2015 = timesdata[timesdata.year == 2015].iloc[:100,:] # first 100 rows of each year
df2016 = timesdata[timesdata.year == 2016].iloc[:100,:]

# Import graph objects
import plotly.graph_objs as go
# Create the traces
trace1 = go.Scatter(
        x=df2014.world_rank, 
        y=df2014.citations,
        mode = "markers",
        name = "2014",
        marker = dict(color ="rgba(255,128,255,0.8)"),
        text = df2014.university_name )

trace2= go.Scatter(
        x=df2015.world_rank,
        y=df2015.citations,
        mode = "markers",
        name = "2015",
        marker = dict(color ="rgba(255,128,2,0.8)"),
        text = df2015.university_name)

trace3 = go.Scatter(
        x=df2016.world_rank,
        y=df2016.citations,
        mode = "markers",
        name = "2016",
        marker = dict(color ="rgba(0,225,185,0.8)"),
        text = df2016.university_name )  # hover text must come from the 2016 frame

data = [trace1, trace2 ,trace3]
layout = dict(title = "Citation vs World Rank of Top 100 Universities with 2014 2015 and 2016 years",
xaxis = dict(title ="World Rank",ticklen =5, zeroline = False),
yaxis = dict(title ="Citation", ticklen =5, zeroline = False) 
             )
fig = dict(data = data, layout = layout)
iplot(fig)

### 4-) Bar Plot 
The difference from a scatter plot is that traces are created with go.Bar() instead of go.Scatter(). There are a few other small differences, which the example below shows.



In [None]:
#First Bar Charts Example: citations and teaching of top 3 universities in 2014 (style1)
df2014 =timesdata[timesdata.year == 2014].iloc[:3,:]
import plotly.graph_objs as go
# Create the traces: one for citations and one for teaching
trace1 = go.Bar(
        x=df2014.university_name,
        y=df2014.citations,
        name = "Citations",
        marker = dict(color = "rgba(255,174,255,0.5)",
                line=dict(color= "rgb(0,0,0)",width =1.5)),
        text = df2014.country)

trace2 = go.Bar( 
        x = df2014.university_name,
        y = df2014.teaching,
        name = "teaching",marker = dict(color = 'rgba(255, 255, 128, 0.5)',
                line=dict(color='rgb(0,0,0)',width=1.5)), # color and width of the outline drawn around each bar
        text = df2014.country) 
data = [trace1,trace2]
layout=go.Layout(barmode = "group") # a title etc. can also be set here; barmode = "group" places the bars of each university side by side
fig = go.Figure(data = data, layout = layout)
iplot(fig)

In [None]:
# Build the same chart from the same data in a different style (style2)
# prepare data frames
df2014 = timesdata[timesdata.year == 2014].iloc[:3,:]
# import graph objects as "go"
import plotly.graph_objs as go

# Create the traces as dictionaries
trace1 = {
    "x": df2014.university_name,
    "y": df2014.citations,
    "name": "citation",
    "type": "bar"
}
trace2 = {
    "x": df2014.university_name,
    "y": df2014.teaching,
    "name": "teaching",
    "type": "bar"
}
data=[trace1, trace2]
layout = {
    "xaxis": {"title":"Top 3 Universities"},
    "barmode": "relative", # stacks the bars: the first trace at the bottom, the next one on top
    "title": "Citations and Teaching of top 3 Universities in 2014"
}
fig = go.Figure(data=data, layout=layout)
iplot(fig)

In [None]:
# Third Bar Charts Example: Horizontal bar charts. (style3) Citation vs income for universities
# Horizontal bar plot


### 5-) Pie Charts 
Pie Charts Example: Students rate of top 7 universities in 2016


* fig: create figures
    * data: plot type
        * values: values of plot
        * labels: labels of plot
        * name: name of plots
        * hoverinfo: information in hover
        * hole: hole width
        * type: plot type like pie
* layout: layout of plot
    * title: title of layout
    * annotations: font, showarrow, text, x, y
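
The `percent` part of `hoverinfo` is each slice's share of the total. As a rough sketch of that arithmetic (the function name `slice_percentages` and the input values are made up for illustration):

```python
def slice_percentages(values):
    """Return each value as a percentage of the total (hypothetical helper)."""
    total = sum(values)
    if total <= 0:
        raise ValueError("the values must sum to a positive number")
    return [100.0 * value / total for value in values]


values = [2.0, 3.0, 5.0]               # made-up slice values
percentages = slice_percentages(values)
print(percentages)                      # [20.0, 30.0, 50.0]
```

Because only the ratios matter, the slices look the same whether the values are given in units or in thousands.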

In [None]:
print(df2016.info())
# num_students is stored as a string, so it cannot be plotted directly
df2016.head()
# The values are also written with a comma (2,573 rather than 2.573), which makes float conversion fail
# The next cell fixes this and builds the chart
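
The cells in this notebook replace the comma with a period, which reads the student numbers in thousands. An alternative sketch, if absolute counts are wanted, is to drop the separator and convert to `int`. The helper `parse_count` and the sample strings are hypothetical and only mimic the format of `num_students`.

```python
def parse_count(text):
    """Convert a string such as '2,243' to the integer 2243 (hypothetical helper)."""
    return int(text.replace(",", ""))


raw_values = ["2,243", "19,919", "462"]   # made-up strings in the dataset's format
counts = [parse_count(value) for value in raw_values]
print(counts)                              # [2243, 19919, 462]
```

For a pie chart both approaches give the same proportions as long as every value contains a separator; values without one (such as `"462"` above) would be scaled differently by the period replacement.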

In [None]:
# Prepare the data first
df2016 = timesdata[timesdata.year == 2016].iloc[:7,:]
pie1=df2016.num_students
pie_list = [float(each.replace(",","."))for each in df2016.num_students] # replace "," with "." and convert to float; the comma is a thousands separator, so the values end up in thousands
labels = df2016.university_name 
# There is only one trace, so the figure is built in a single step with the dictionary style (see bar plot style 2)
fig = {
    "data" : [
        {
            "values": pie_list,
            "labels": labels,
            "domain": {"x":[0,.5]},
            "name": "Number of Student",
            "hoverinfo":"label+percent+name",
            "hole":.1,
            "type":"pie"
        },],
    "layout": {
        "title":"Universities Number of Students",
        "annotations":[
            {"font": {"size":20},
            "showarrow":False,
            "text": "Number of Students",
            "x":0.20,
            "y":1},
        ]
    }
}
iplot(fig)

### 6-) Bubble Plot 
A bubble plot is essentially a scatter plot; the difference is that the color and size of each marker are tied to data values.
Bubble Charts Example: University world rank (first 20) vs teaching score with number of students (size) and international score (color) in 2016


* x = x axis
* y = y axis
* mode = markers(scatter)
* marker = marker properties
    * color = third dimension of plot. International score
    * size = fourth dimension of plot. Number of students
* text: university names

In [None]:
df2016 = timesdata[timesdata.year == 2016].iloc[:20,:]
df2016

In [None]:
df2016 = timesdata[timesdata.year == 2016].iloc[:20,:]
num_students_size  = [float(each.replace(',', '.')) for each in df2016.num_students]
international_color = [float(each) for each in df2016.international]
data = [
    {
        "x":df2016.world_rank,
        "y":df2016.teaching,
        "mode":"markers",
        "marker": {
            "color":international_color, # marker color encodes the international score
            "size":num_students_size, # marker size encodes the number of students
            "showscale":True # show the color scale
        },
        "text": df2016.university_name
    }
]
iplot(data)

For example, the bubbles at the top ranks are small because the number of students was set as the marker size, while the University of Toronto at rank 19 has a much larger bubble because of its large student count.
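
When raw counts are used as marker sizes, the bubbles can become very large. Plotly's bubble chart documentation suggests scaling them with `sizeref = 2 * max(size) / (desired_max_px ** 2)` together with `sizemode = "area"`. The sketch below applies that formula; the helper name `bubble_sizeref` and the student counts are made up for illustration.

```python
def bubble_sizeref(sizes, max_marker_px=40):
    """Scaling factor so the largest bubble is about max_marker_px wide (hypothetical helper)."""
    return 2.0 * max(sizes) / (max_marker_px ** 2)


student_counts = [2243, 19919, 15596, 66198]   # made-up absolute counts
marker = {
    "size": student_counts,
    "sizemode": "area",                          # bubble area, not diameter, follows the value
    "sizeref": bubble_sizeref(student_counts),
    "sizemin": 4,                                # keep the smallest bubbles visible
}
print(marker["sizeref"])                         # 82.7475
```

A dictionary like `marker` could replace the `"marker"` entry of a bubble trace, alongside the existing `color` and `showscale` keys.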

### 7-) Histogram 
 <font color='red'>
Let's look at the histogram of the student-staff ratio in 2011 and 2012.
    <font color='black'>
* trace1 = first histogram
    * x = x axis
    * y = y axis
    * opacity = opacity of histogram
    * name = name of legend
    * marker = color of histogram
* trace2 = second histogram
* layout = layout 
    * barmode = mode of histogram like overlay. Also you can change it with *stack*
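
To make concrete what a histogram trace displays, the sketch below counts values per fixed-width bin using only the standard library. It is an illustration of binning in general, not Plotly's own algorithm (Plotly chooses its bins automatically unless told otherwise); `bin_counts` and the ratios are made up.

```python
import math
from collections import Counter


def bin_counts(values, bin_width):
    """Count values per bin [k*w, (k+1)*w), keyed by the bin's left edge (hypothetical helper)."""
    counts = Counter(math.floor(value / bin_width) * bin_width for value in values)
    return dict(sorted(counts.items()))


ratios_2011 = [6.9, 8.9, 9.0, 11.6, 11.8, 12.1, 17.4]   # made-up student-staff ratios
hist_2011 = bin_counts(ratios_2011, bin_width=5)
print(hist_2011)                                          # {5: 3, 10: 3, 15: 1}
```

With `barmode = "overlay"` the bars of two such histograms are drawn at the same positions and show through each other via `opacity`; with `"stack"` the counts of the second trace sit on top of the first.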

In [None]:
# Prepare the data
x2011 = timesdata.student_staff_ratio[timesdata.year == 2011] # student-staff ratios for the year 2011
x2012 = timesdata.student_staff_ratio[timesdata.year == 2012]
# Create the traces
trace1 = go.Histogram(
    x=x2011,
    opacity = 0.75,
    name ="2011",
    marker=dict(color="rgba(171,50,96,0.6)"))

trace2= go.Histogram(
    x=x2012,
    opacity= 0.60,
    name="2012",
    marker=dict(color="rgba(12,65,188,0.6)"))

data=[trace1,trace2]
layout= go.Layout(barmode = "overlay", # draws the two histograms on top of each other
                 title="student-staff ratio in 2011 and 2012",
                 xaxis=dict(title="Students-Staff Ratio"),
                 yaxis=dict(title="Count"))
fig = go.Figure(data=data, layout=layout)
iplot(fig)

### 8-) Word Cloud 
A word cloud is not a Plotly chart, but it is a useful visualization to know. Let's look at which countries are mentioned most in 2011.
* WordCloud = word cloud library that I import at the beginning of the kernel
    * background_color = color of the background
    * generate = generates a word cloud from the country names (x2011)
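
As a cross-check on what the cloud should emphasize, the country names can be counted directly with the standard library. The list below is made up for illustration. Note that joining the names with spaces, as the word cloud input does, splits multi-word names into separate words, and WordCloud applies its own tokenization and stopword handling, so its counts may differ from these.

```python
from collections import Counter

# Made-up sample in the format of the country column
countries_2011 = [
    "United States of America",
    "United Kingdom",
    "United States of America",
    "Canada",
]

name_counts = Counter(countries_2011)                      # counts whole country names
word_counts = Counter(" ".join(countries_2011).split())   # counts single words

print(name_counts.most_common(1))   # [('United States of America', 2)]
print(word_counts["United"])        # 3
```

The difference between the two counters shows why "United" can appear larger in the cloud than any single country.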

In [None]:
# Prepare the data
x2011 = timesdata.country[timesdata.year == 2011] # countries of the universities ranked in 2011
plt.subplots(figsize =(8,8))
wordcloud = WordCloud(
                        background_color="white",
                        width=512,
                        height=384).generate(" ".join(x2011))
plt.imshow(wordcloud) # a word cloud is displayed as an image
plt.axis("off")
plt.savefig("graph.png") # saves the image so that it appears in the Kaggle output
plt.show()


### 9-) Box Plot
<font color='red'>
* Box Plots
    * Median (50th percentile) = middle value of the sorted data set: 50% of the data are less than the median. It is also the second quartile (Q2)
        * 25th percentile = quartile 1 (Q1), the lower quartile
        * 75th percentile = quartile 3 (Q3), the upper quartile
        * height of box = IQR = interquartile range = Q3-Q1
        * Whiskers = extend to the most extreme data points within 1.5 * IQR of Q1 and Q3
        * Outliers = points more than 1.5 * IQR below Q1 or above Q3, by the common convention
        
    <font color='black'>
    * trace = box
        * y = data we want to visualize with box plot 
        * marker = color  
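
The quantities in the list above can be computed by hand with the standard library. This is a minimal sketch on made-up scores; `box_stats` is a hypothetical helper, and the quartile method used here (`method="inclusive"`) is an assumption that may not match the interpolation Plotly uses when it draws the box.

```python
import statistics


def box_stats(values):
    """Quartiles, IQR, 1.5*IQR fences and outliers (hypothetical helper)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]   # made-up values with one extreme point
stats = box_stats(scores)
print(stats["q1"], stats["median"], stats["q3"])   # 3.25 5.5 7.75
print(stats["outliers"])                            # [30]
```

Here the upper fence is 7.75 + 1.5 * 4.5 = 14.5, so the value 30 is drawn as an outlier and the upper whisker stops at 9.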

In [None]:
# Prepare the data
x2015 = timesdata[timesdata.year == 2015]

trace1 = go.Box(
    y=x2015.total_score,
    name="Total Score of Universities in 2015",
    marker=dict(color="rgb(12, 12, 140)"))  # placeholder color (assumed); any valid rgb value works