# Popular Data Science Questions

In this project, we explore data science in a business context, working with data from [Data Science Stack Exchange](https://datascience.stackexchange.com/). We're working for a company that creates data science content, be it books, online articles, videos or interactive text-based platforms.

We're tasked with figuring out the best content to write about.

<div class="dq-px-8"><div id="body" class="dq-lesson-markup dq-max-w-screen-xl"><div><p>Stack Exchange hosts sites on a multitude of fields and subjects, including mathematics, physics, philosophy, and <a href="https://datascience.stackexchange.com/" target="_blank">data science</a>! Here's a sample of the most popular sites:</p>
<p><a href="https://stackexchange.com/sites?view=list#percentanswered" target="_blank"><img src="https://dq-content.s3.amazonaws.com/469/se_sites.png" alt="se_sites"></a></p>
<p>Stack Exchange employs a reputation award system for its questions and answers. Each post, whether a question or an answer, is subject to upvotes and downvotes. This ensures that good posts are easily identifiable.</p>

<div class="dq-px-8"><div id="body" class="dq-lesson-markup dq-max-w-screen-xl"></div>
    
<div><p><p>After spending some time investigating the website, we decide that the tags will be very useful in categorizing content.</p>
<p>Now comes the challenge of accessing the data <em>en masse</em>. One potential solution would be to <a href="https://en.wikipedia.org/wiki/Web_scraping" target="_blank">scrape</a> the site. Because an easier alternative is available, we won't need to scrape anything.</p>
<p>Stack Exchange provides a public database for each of its websites. <a href="https://data.stackexchange.com/datascience/query/new" target="_blank">Here</a>'s a link to query and explore Data Science Stack Exchange's database.</p>
<p>You can read more about Stack Exchange Data Explorer (SEDE) in its <a href="https://data.stackexchange.com/help" target="_blank">help section</a> and in <a href="https://data.stackexchange.com/tutorial" target="_blank">this</a> tutorial.</p>
<p></p><center>
<img src="https://dq-content.s3.amazonaws.com/469/dsde.png" alt="dsde">
</center><p></p>
<p>In the image above we can see the names of the tables in the database. Clicking on a table name expands it to show the columns of that table.</p>
<p>The gif below shows how we can run the query <code>SELECT * FROM tags;</code>.</p>
<p><img src="https://dq-content.s3.amazonaws.com/469/run_query.gif" alt="run_query"></p>
<p>Note that SEDE uses a different SQL dialect than SQLite: <a href="https://en.wikipedia.org/wiki/Transact-SQL" target="_blank">Transact-SQL</a>, Microsoft's SQL. Most things are the same, but some are different. For instance, the query below selects the top 10 results.</p>
</div><pre><code>SELECT TOP 10 *
  FROM tags
 ORDER BY Count DESC;</code></pre><div>
<p>In SQLite we would not only use the keyword <code>LIMIT</code> instead of <code>TOP</code>, we would also include it at the end of the query instead of in the <code>SELECT</code> clause. <a href="https://www.mssqltips.com/sqlservertip/4777/comparing-some-differences-of-sql-server-to-sqlite/" target="_blank">Here</a>'s a helpful resource.</p></div></div></div>
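
To make the `TOP` versus `LIMIT` difference concrete, here is a minimal sketch that runs the SQLite form against an in-memory table. The table contents are made up for illustration and are not SEDE data; only the `tags` table name and its `Count` column come from the query above.

```python
import sqlite3

# Toy stand-in for the SEDE `tags` table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tags (TagName TEXT, "Count" INTEGER)')
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [("python", 30), ("r", 10), ("keras", 20)],
)

# SQLite: LIMIT goes at the end of the query, where T-SQL would use SELECT TOP n.
top_two = conn.execute(
    'SELECT TagName, "Count" FROM tags ORDER BY "Count" DESC LIMIT 2'
).fetchall()
conn.close()

print(top_two)
```

The same query on SEDE would start with `SELECT TOP 2` and have no `LIMIT` clause.
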
<p>We saved the result of the query above as a CSV file and display it here using the pandas library.</p>
    

In [None]:
import pandas as pd

query_result = pd.read_csv("QueryResults.csv")
query_result.head(20)

## Getting the Data
<p><p><div class="dq-px-8"><div id="body" class="dq-lesson-markup dq-max-w-screen-xl"><div><p>The posts table has a lot of columns. We'll be focusing our attention on those that seem relevant towards our goal:</p>
<ul>
<li><code>Id</code>: An identification number for the post.</li>
<li>
<p><code>PostTypeId</code>: An identification number for the type of post.</p>
<p><img src="https://dq-content.s3.amazonaws.com/469/PostTypes.png" alt="posttypes"></p>
</li>
<li>
<p><code>CreationDate</code>: The date and time of creation of the post.</p>
</li>
<li><code>Score</code>: The post's score.</li>
<li><code>ViewCount</code>: How many times the post was viewed.</li>
<li><code>Tags</code>: What tags were used.</li>
<li><code>AnswerCount</code>: How many answers the question got (only applicable to question posts).</li>
<li><code>FavoriteCount</code>: How many times the question was <a href="https://meta.stackexchange.com/questions/53585/how-do-favorite-questions-work" target="_blank">favored</a> (only applicable to question posts).</li>
</ul>
<p>Note that with the exception of the tags column, the last few columns contain information about how popular the post is — the kind of information we're after.</p>
<p>There are eight different types of post. Before we try to figure out which of them are relevant to us, let's check how many posts of each type there are:</p>
</div><pre><code>SELECT PostTypeId, COUNT(*) AS NrOfPosts
  FROM posts
 GROUP BY PostTypeId;</code></pre><div>
<table>
<tbody><tr>
<th>PostTypeId</th>
<th>NrOfPosts</th>
</tr>
<tr>
<td>1</td>
<td>21446</td>
</tr>
<tr>
<td>2</td>
<td>23673</td>
</tr>
<tr>
<td>4</td>
<td>236</td>
</tr>
<tr>
<td>5</td>
<td>236</td>
</tr>
<tr>
<td>6</td>
<td>11</td>
</tr>
<tr>
<td>7</td>
<td>1</td>
</tr>
</tbody></table>
<p>Due to their low volume, posts that aren't questions or answers are mostly inconsequential. Even if posts of these kinds happen to be immensely popular, they would just be outliers and not relevant to us. We'll therefore focus on the questions.</p>
<p>Since we're only interested in recent posts, we'll limit our analysis to the posts of 2019. (At the time of writing it is early 2020). </p>
</div></div></div>
<p>

----------------------
The last thing we need to point out here is the database schema. We found it documented on [Meta Stack Exchange](https://meta.stackexchange.com/questions/2677/database-schema-documentation-for-the-public-data-dump-and-sede), so if this project needs further development, we will refer to it.

Now, the first thing we need is a query against the SEDE DSSE database that extracts the columns listed above for all the questions (`PostTypeId = 1`) posted in 2019. We download the output and load it into another pandas dataframe.
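
The exact query we ran is not reproduced here. As a hedged sketch of the filter logic only, the example below applies the same two conditions (question posts, created in 2019) to a small in-memory SQLite table. The rows are invented, and the SQLite date function differs from what SEDE's Transact-SQL would use.

```python
import sqlite3

# Toy stand-in for the SEDE `posts` table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (Id INTEGER, PostTypeId INTEGER, CreationDate TEXT, Score INTEGER)"
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2018-12-31 23:59:59", 4),  # question, but from 2018: dropped
        (2, 1, "2019-03-01 10:00:00", 2),  # question from 2019: kept
        (3, 2, "2019-03-02 11:00:00", 5),  # answer: dropped
        (4, 1, "2019-12-31 23:59:59", 0),  # question from 2019: kept
    ],
)

# On SEDE (T-SQL) the date filter could be written as YEAR(CreationDate) = 2019;
# SQLite has no YEAR(), so this sketch uses strftime instead.
rows = conn.execute(
    """
    SELECT Id, CreationDate, Score
      FROM posts
     WHERE PostTypeId = 1
       AND strftime('%Y', CreationDate) = '2019'
     ORDER BY Id
    """
).fetchall()
conn.close()

question_ids = [row[0] for row in rows]
print(question_ids)
```
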

The result of the query was stored in a file called `2019_questions.csv`. Here are the first few rows of the data we got:

In [None]:
query_result = pd.read_csv("2019_questions.csv")
query_result.head(14)

Looking at the rows, it stands out that `FavoriteCount` has missing values. What other issues are there with the data? Let's explore it. We will:

<ol>
<li>Read in the file into a dataframe.</li>
<li>Explore the data. Try to answer a few of these questions in a markdown cell:<ul>
<li>How many missing values are there in each column?</li>
<li>Can we fix the missing values somehow?</li>
<li>Are the types of each column adequate?</li>
<li>What can we do about the <code>Tags</code> column?</li>
</ul>
</li>
</ol>

In [None]:
# We import everything that we'll use with Pandas:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

In [None]:
questions = (
    pd.read_csv("2019_questions.csv",
                parse_dates=["CreationDate"])
)

questions.info() #How many missing values are there in each column?

In [None]:
# number of NaN's:

questions.isna().sum()

In [None]:
# Checking `Tags` object type:
questions["Tags"].apply(lambda value: type(value)).unique()

- We can see that `FavoriteCount` has 7432 instances of `NaN`. We can drop them or turn them into zeros.
- All column types seem adequate, although `FavoriteCount` will need converting to an integer type once its missing values are filled.
- The `Tags` column has the `object` dtype, and each value is a single string holding all the tags of a question. We can split each string into a list of individual tags.

## Cleaning the Data

<div id="body" class="dq-lesson-markup dq-max-w-screen-xl"><div><p>In the previous section, we identified issues with the data. Fortunately for us, the folks at Stack Exchange did a great job of providing clean data. Let's fix the one issue we found, set the appropriate types for the columns, and clean the <code>Tags</code> column to fit our purposes.</p>
<p>At the end of this section, the types of the columns should be as follows.</p>
</div><pre><code>Id                        int64
CreationDate     datetime64[ns]
Score                     int64
ViewCount                 int64
Tags                     object
AnswerCount               int64
FavoriteCount             int64</code></pre><div>
<p>The values in the <code>Tags</code> column are strings that look like this:</p>
</div><pre><code>"&lt;machine-learning&gt;&lt;regression&gt;&lt;linear-regression&gt;&lt;regularization&gt;"</code></pre><div>
<p>We'll want to transform this string into something more suitable for typical string methods. Our goal will be to transform strings like the one above into something like:</p>
</div><pre><code>"machine-learning,regression,linear-regression,regularization"</code></pre><div>
<p>We can then split on <code>,</code> to obtain a list.</p></div></div>
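
The transformation described above can be sketched in plain Python on a single string. The helper name `parse_tags` is hypothetical and is not used elsewhere in this notebook.

```python
def parse_tags(raw):
    """Turn '<a><b>' into ['a', 'b'] via the intermediate 'a,b' form."""
    inner = raw.strip("<>")                     # drop the outermost angle brackets
    comma_separated = inner.replace("><", ",")  # '><' marks the boundary between tags
    return comma_separated.split(",")


example = "<machine-learning><regression><linear-regression><regularization>"
tags = parse_tags(example)
print(tags)
```

On a whole column, the same idea is applied with the pandas `.str` accessor instead of a Python loop.
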
<p>
Things we'll do:
<p>
<ol>
<li>Fill in the missing values with <code>0</code>.</li>
<li>Set the types of each column in accordance with what was illustrated above.</li>
<li>Clean the <code>Tags</code> column and assign it back to itself:<ol>
<li>Use the process illustrated above.</li>
<li>Assign the result to <code>questions["Tags"]</code>.</li>
</ol>
</li>
</ol>

In [None]:
# Fill missing values with 0; the fill value is a scalar, so no axis argument is needed.
questions.fillna(0, inplace=True)
# questions.head() # Test

questions["FavoriteCount"] = questions["FavoriteCount"].astype(np.int64)
questions.info()

In [None]:
questions["Tags"] = (
    questions["Tags"]
    .str.replace("^<|>$", "", regex=True)  # regex=True: the pattern is a regular expression
    .str.split("><")
)

questions.head(3) # Test

## Most Used and Most Viewed

We now focus on determining the most popular tags. We'll do so by considering two different popularity proxies: for each tag we'll count how many times the tag was used, and how many times a question with that tag was viewed.

We could take into account the score, or whether or not a question is part of someone's favorite questions. These are all reasonable options to investigate; but we'll limit the focus of our research to counts and views for now.

We will:
<div id="body" class="dq-lesson-markup dq-lesson-learn-instructions dq-max-w-screen-xl"><ol>
<li>Count how many times each tag was used.</li>
<li>Count how many times each tag was viewed.</li>
<li>Create visualizations for the top tags of each of the above results.</li>
</ol></div>

In [None]:
# how many times each tag was used:
from collections import Counter
# https://stackoverflow.com/questions/2600191/how-can-i-count-the-occurrences-of-a-list-item
def c_tags(tag):
    return dict(Counter(tag))

questions["tag_row_dict_count"] = questions["Tags"].apply(c_tags)
# https://stackoverflow.com/questions/52855168/how-to-find-sum-of-dictionaries-in-a-pandas-dataframe-across-all-rows
counts = sum(map(Counter, questions['tag_row_dict_count']), Counter())
tag_dict = pd.DataFrame.from_dict(counts, orient='index')

# Creating a small dataframe:

tag_dict.rename(columns={0: "Count tags"}, inplace=True)
tag_dict.sort_values(
    by=["Count tags"],
    axis=0,
    inplace=True,
    ascending=False
)

most_used = tag_dict.head(20)
most_used

In [None]:
tag_view_count = dict()

for index, row in questions.iterrows():
    for tag in row['Tags']:
        if tag in tag_view_count:
            tag_view_count[tag] += row['ViewCount']
        else:
            tag_view_count[tag] = row['ViewCount']
            
tag_view_count = pd.DataFrame.from_dict(
    tag_view_count,
    orient="index"
)

tag_view_count.rename(
    columns={0: "Viewed tags"},
    inplace=True
)

tag_view_count.sort_values(
    by=["Viewed tags"],
    axis=0,
    inplace=True,
    ascending=False
)
most_viewed = tag_view_count.head(20)
most_viewed
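
As an alternative to the loops above, both tallies can be computed in a vectorized way. The sketch below is not the approach used in this notebook; it assumes a pandas version that provides `DataFrame.explode` (0.25 or later) and uses a made-up toy dataframe in place of `questions`.

```python
import pandas as pd

# Toy stand-in for `questions` after cleaning; the values are made up.
toy = pd.DataFrame({
    "Tags": [["python", "pandas"], ["python"], ["keras", "python"]],
    "ViewCount": [100, 50, 10],
})

# One row per (question, tag) pair.
exploded = toy.explode("Tags")

# How many times each tag was used.
tag_counts = exploded["Tags"].value_counts()

# Total views of the questions carrying each tag, largest first.
tag_views = (
    exploded.groupby("Tags")["ViewCount"]
    .sum()
    .sort_values(ascending=False)
)

print(tag_counts.head(20))
print(tag_views.head(20))
```
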

In [None]:
fig, axes = plt.subplots(nrows=1, ncols=2)
fig.set_size_inches((24, 10))
most_used.plot(kind="barh", ax=axes[0], subplots=True)
most_viewed.plot(kind="barh", ax=axes[1], subplots=True)
#let's do some grid, but only x-axis
axes[0].grid(color='xkcd:bright red', axis='x')
axes[1].grid(color='xkcd:bright red', axis='x')

for n in range(0,2):
    axes[n].spines['top'].set_visible(False)
    axes[n].spines['right'].set_visible(False)

## Relations Between Tags
<p>
<div class="dq-px-8"><div id="body" class="dq-lesson-markup dq-max-w-screen-xl"><div><p>For the next step:</p>
<ul>
<li><code>most_used</code> is a dataframe that counts how many times each of the top 20 tags was used. </li>
<li><code>most_viewed</code> is a dataframe that counts how many times each of the top 20 tags was viewed.</li>
</ul>
<p>Looking at the results from the previous section, we see that most top tags are present in both dataframes.</p>
<p>Let's see what tags are in <code>most_used</code>, but not in <code>most_viewed</code>. We can identify them by the missing values in <code>ViewCount</code> below. </p>
</div><pre><code>in_used = pd.merge(most_used, most_viewed, how="left", left_index=True, right_index=True)</code></pre><div>
<p></p><center>
<table>
<thead>
<tr>
<th></th>
<th>Count</th>
<th>ViewCount</th>
</tr>
</thead>
<tbody>
<tr>
<th>machine-learning-model</th>
<td>224</td>
<td>NaN</td>
</tr>
<tr>
<th>statistics</th>
<td>234</td>
<td>NaN</td>
</tr>
<tr>
<th>clustering</th>
<td>257</td>
<td>33928.0</td>
</tr>
<tr>
<th>predictive-modeling</th>
<td>265</td>
<td>NaN</td>
</tr>
<tr>
<th>r</th>
<td>268</td>
<td>NaN</td>
</tr>
<tr>
<th>dataset</th>
<td>340</td>
<td>43151.0</td>
</tr>
<tr>
<th>regression</th>
<td>347</td>
<td>49451.0</td>
</tr>
<tr>
<th>pandas</th>
<td>354</td>
<td>201787.0</td>
</tr>
<tr>
<th>lstm</th>
<td>402</td>
<td>74458.0</td>
</tr>
<tr>
<th>time-series</th>
<td>466</td>
<td>64134.0</td>
</tr>
<tr>
<th>cnn</th>
<td>489</td>
<td>70349.0</td>
</tr>
<tr>
<th>nlp</th>
<td>493</td>
<td>71382.0</td>
</tr>
<tr>
<th>scikit-learn</th>
<td>540</td>
<td>128110.0</td>
</tr>
<tr>
<th>tensorflow</th>
<td>584</td>
<td>121369.0</td>
</tr>
<tr>
<th>classification</th>
<td>685</td>
<td>104457.0</td>
</tr>
<tr>
<th>keras</th>
<td>935</td>
<td>268608.0</td>
</tr>
<tr>
<th>neural-network</th>
<td>1055</td>
<td>185367.0</td>
</tr>
<tr>
<th>deep-learning</th>
<td>1220</td>
<td>233628.0</td>
</tr>
<tr>
<th>python</th>
<td>1814</td>
<td>537585.0</td>
</tr>
<tr>
<th>machine-learning</th>
<td>2693</td>
<td>388499.0</td>
</tr>
</tbody>
</table>
</center><p></p>
<p>Similarly, let's see what tags are in the latter, but not the former:</p>
</div><pre><code>pd.merge(most_used, most_viewed, how="right", left_index=True, right_index=True)</code></pre><div>
<p></p><center>
<table class="dataframe">
<thead>
<tr>
<th></th>
<th>Count</th>
<th>ViewCount</th>
</tr>
</thead>
<tbody>
<tr>
<th>clustering</th>
<td>257.0</td>
<td>33928</td>
</tr>
<tr>
<th>csv</th>
<td>NaN</td>
<td>38654</td>
</tr>
<tr>
<th>pytorch</th>
<td>NaN</td>
<td>40240</td>
</tr>
<tr>
<th>dataset</th>
<td>340.0</td>
<td>43151</td>
</tr>
<tr>
<th>regression</th>
<td>347.0</td>
<td>49451</td>
</tr>
<tr>
<th>numpy</th>
<td>NaN</td>
<td>49767</td>
</tr>
<tr>
<th>time-series</th>
<td>466.0</td>
<td>64134</td>
</tr>
<tr>
<th>cnn</th>
<td>489.0</td>
<td>70349</td>
</tr>
<tr>
<th>nlp</th>
<td>493.0</td>
<td>71382</td>
</tr>
<tr>
<th>lstm</th>
<td>402.0</td>
<td>74458</td>
</tr>
<tr>
<th>dataframe</th>
<td>NaN</td>
<td>89352</td>
</tr>
<tr>
<th>classification</th>
<td>685.0</td>
<td>104457</td>
</tr>
<tr>
<th>tensorflow</th>
<td>584.0</td>
<td>121369</td>
</tr>
<tr>
<th>scikit-learn</th>
<td>540.0</td>
<td>128110</td>
</tr>
<tr>
<th>neural-network</th>
<td>1055.0</td>
<td>185367</td>
</tr>
<tr>
<th>pandas</th>
<td>354.0</td>
<td>201787</td>
</tr>
<tr>
<th>deep-learning</th>
<td>1220.0</td>
<td>233628</td>
</tr>
<tr>
<th>keras</th>
<td>935.0</td>
<td>268608</td>
</tr>
<tr>
<th>machine-learning</th>
<td>2693.0</td>
<td>388499</td>
</tr>
<tr>
<th>python</th>
<td>1814.0</td>
<td>537585</td>
</tr>
</tbody>
</table>
</center><p></p>
<p>The tags present in <code>most_used</code> and not present in <code>most_viewed</code> are:</p>
<ul>
<li><code>machine-learning-model</code></li>
<li><code>statistics</code></li>
<li><code>predictive-modeling</code></li>
<li><code>r</code></li>
</ul>
<p>And the tags present in <code>most_viewed</code> but not in <code>most_used</code> are:</p>
<ul>
<li><code>csv</code></li>
<li><code>pytorch</code></li>
<li><code>dataframe</code></li>
</ul>
<p>Some tags also stand out as being related. For example, <code>python</code> is related to <code>pandas</code>: we can find both pythons and pandas in the same country or, better yet, pandas is a Python library. So by writing about pandas, we can tackle two tags at once.</p>
<p>Other pairs of tags shouldn't be related at all, like <code>pandas</code> and <code>r</code>:</p>
</div><div class="react-codemirror2 dq-editor dq-mb-4 dq-p-2 dq-bg-gray-50 dark:dq-bg-gray-800"><pre><code>questions[questions["Tags"].apply(
    lambda tags: "r" in tags and "pandas" in tags)
]</code></pre></div><div>
<table class="dataframe">
<thead>
<tr>
<th></th>
<th>Id</th>
<th>CreationDate</th>
<th>Score</th>
<th>ViewCount</th>
<th>Tags</th>
<th>AnswerCount</th>
<th>FavoriteCount</th>
</tr>
</thead>
<tbody>
<tr>
<th>2873</th>
<td>60074</td>
<td>2019-09-11 20:35:17</td>
<td>0</td>
<td>22</td>
<td>[r, pandas, dplyr]</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<th>3651</th>
<td>49148</td>
<td>2019-04-11 19:41:39</td>
<td>1</td>
<td>83</td>
<td>[r, data-mining, pandas, matlab, databases]</td>
<td>3</td>
<td>0</td>
</tr>
</tbody>
</table>
<p>Just two results. We can look at these questions by replacing <code>ID</code> in <code>https://datascience.stackexchange.com/questions/ID</code> with each question's <code>Id</code> value and see what they are about.</p></div></div></div>
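
As a small illustration of that lookup, the URLs can be built directly from the `Id` column. This is only a sketch: the two-row dataframe is a stand-in that copies the `Id` values from the table above, and `r_and_pandas`, `base_url` and `urls` are names introduced here for illustration.

```python
import pandas as pd

# Stand-in for the filtered result above (Id values copied from the table).
r_and_pandas = pd.DataFrame({"Id": [60074, 49148]}, index=[2873, 3651])

base_url = "https://datascience.stackexchange.com/questions/"

# Build one URL per question by appending its Id to the base URL.
urls = [base_url + str(question_id) for question_id in r_and_pandas["Id"]]

for url in urls:
    print(url)
```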

-------------------

<div id="body" class="dq-lesson-markup dq-lesson-learn-instructions dq-max-w-screen-xl"><p>The goal of this part is to make us think about technical solutions to determining how tags are related.</p>
<ol>
<li>Brainstorm some ways in which we could find relationships between pairs of tags.</li>
<li>Brainstorm some ways in which we could find relationships between multiple tags.</li>
</ol></div>
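
One possible approach to the first point is to count how often two tags appear on the same question. The sketch below is an assumption about how this could be done, not something taken from the project: `sample_tags` is toy data (the first two lists come from the filtered table above, the last two are invented), and `pair_counts` is a name introduced here.

```python
from collections import Counter
from itertools import combinations

# Toy tag lists: the first two come from the table above, the rest are invented.
sample_tags = [
    ["r", "pandas", "dplyr"],
    ["r", "data-mining", "pandas", "matlab", "databases"],
    ["python", "pandas"],
    ["python", "keras", "deep-learning"],
]

pair_counts = Counter()
for tags in sample_tags:
    # Sort the unique tags so that (a, b) and (b, a) count as the same pair.
    pair_counts.update(combinations(sorted(set(tags)), 2))

# Pairs that co-occur most often are candidates for being related.
print(pair_counts.most_common(3))
```

The same idea extends to groups of tags by asking `combinations` for triples or larger groups, although the number of combinations grows quickly.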

-----------------------
After analyzing the data, we concluded that the tags depend on each other so heavily, both directly and indirectly, that analyzing their relationships further would bring no significant benefit. The data we have obtained up to this point is sufficient.

Can <a href="https://en.wikipedia.org/wiki/Domain_knowledge" target="_blank">domain knowledge</a> give us more insight? Let's see how it can help us here.</p>
<p>We have noticed that the most used tags are also the most viewed. Among the top 10 tags of each, these are the tags in common: <code>python</code>, <code>machine-learning</code>, <code>deep-learning</code>, <code>neural-network</code>, <code>keras</code>, <code>tensorflow</code>, <code>classification</code>, <code>scikit-learn</code>.</p>

To find out whether there are strong relations between them, we checked the [tags page on DSSE](https://datascience.stackexchange.com/tags).

**From our research, we can conclude that these tags are strongly related. It's fair to say that they are related in some shape or form, roughly as a hierarchy.**



--------------------------------
## Just a Fad?
<p>
Before we officially make our recommendation, it would be nice to solidify our findings with additional proof. More specifically, one thing that comes to mind is "Is deep learning just a fad?" Ideally, the content we decide to create will be the most useful for as long as possible. Could interest in deep learning be slowing down? Back to SEDE!</p>
<p>The file <code>all_questions.csv</code> holds the result of the query below, which fetches all of the questions ever asked on DSSE, along with their dates and tags.</p>
</div><div class="react-codemirror2 dq-editor dq-mb-4 dq-p-2 dq-bg-gray-50 dark:dq-bg-gray-800"><pre><code>SELECT Id, CreationDate, Tags
  FROM posts
 WHERE PostTypeId = 1;</code></pre></div><div>
<p>In this section, we will track the interest in deep learning across time. For each time period, we will:</p>
<ul>
<li>Count how many deep learning questions are asked.</li>
<li>Count the total number of questions.</li>
<li>Compute the share of deep learning questions relative to the total number of questions.</li>
</ul></div></div></div>

<p>To do this, we will:
    <div id="body" class="dq-lesson-markup dq-lesson-learn-instructions dq-max-w-screen-xl"><ol>
<li>Read the file <code>all_questions.csv</code> into a dataframe.</li>
<li>Transform the tags column in a similar manner to what was previously done.</li>
<li>Think about which questions should be classified as deep learning questions and then implement that definition.</li>
<li>Decide on an adequate timeframe and track interest in deep learning across that timeframe:<ul>
<li>Count how many deep learning questions are asked per time period.</li>
<li>Count the total number of questions per time period.</li>
<li>Compute the share of deep learning questions relative to the total number of questions per time period.</li>
</ul>
</li>
<li>Write our observations and final recommendation in a markdown cell.</li>
</ol></div>

In [None]:
all_qu = pd.read_csv("all_questions.csv")
all_qu.head()

In [None]:
# Clean the Tags column and assign it back to itself:
all_qu["Tags"] = (
    all_qu["Tags"]
    .str.replace("^<|>$", "", regex=True)  # strip the outer angle brackets
    .str.split("><")
)
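
To make the effect of this cleaning step concrete, here is a minimal sketch on two made-up raw strings. It assumes the raw `Tags` values look like `<tag1><tag2>`, which is what the pattern above implies; `raw` and `cleaned` are names introduced here. `regex=True` is passed explicitly because recent pandas versions treat the pattern as a literal string otherwise.

```python
import pandas as pd

# Made-up raw values in the assumed "<tag1><tag2>" format.
raw = pd.Series(["<machine-learning><python>", "<r><pandas><dplyr>"])

cleaned = (
    raw
    .str.replace("^<|>$", "", regex=True)  # drop the leading "<" and trailing ">"
    .str.split("><")                       # split on the "><" between tags
)

print(cleaned.tolist())
```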

In [None]:
"""
Let's count the tags in each `Tags` entry, as we did with the previous dataframe.
"""

# how many times each tag was used:
from collections import Counter
# https://stackoverflow.com/questions/2600191/how-can-i-count-the-occurrences-of-a-list-item
def c_tags(tag):
    return dict(Counter(tag))

all_qu["tag_row_dict_count"] = all_qu["Tags"].apply(c_tags)
# https://stackoverflow.com/questions/52855168/how-to-find-sum-of-dictionaries-in-a-pandas-dataframe-across-all-rows
counts_qu = sum(
    map(Counter, all_qu['tag_row_dict_count']),
    Counter()
)

tag_dict_qu = pd.DataFrame.from_dict(
    counts_qu,
    orient='index'
)

# Drop the helper column, which is no longer needed:
all_qu.drop("tag_row_dict_count", axis=1, inplace=True)

# Rename the count column and sort by it:

tag_dict_qu.rename(
    columns={0: "Count tags for all_qu"},
    inplace=True
)

tag_dict_qu.sort_values(
    by=["Count tags for all_qu"],
    axis=0,
    inplace=True,
    ascending=False
)

most_used_qu = tag_dict_qu.head(20)
most_used_qu
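
As an aside, the same kind of count can be obtained more compactly with `explode` and `value_counts`. This is an alternative sketch on toy data, not the approach used above; `toy` and `tag_counts` are names introduced here.

```python
import pandas as pd

# Toy dataframe whose Tags column already holds lists of tags.
toy = pd.DataFrame({"Tags": [["python", "pandas"], ["python", "keras"], ["r"]]})

# explode() gives one row per tag; value_counts() then counts each tag,
# sorted from most to least frequent.
tag_counts = toy["Tags"].explode().value_counts()

print(tag_counts.head(20))
```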

Before we decide which questions should be classified as `deep learning` questions, we need to decide which tags are deep learning tags.
We treat the following tags as within the scope of `deep learning`:
- `lstm`
- `cnn`
- `scikit-learn`
- `tensorflow`
- `keras`
- `neural-network`
- `deep-learning`

In [None]:
def deep_learning_filter(tags):
    for tag in tags:
        if tag in ["lstm", "cnn", "scikit-learn", "tensorflow",
                   "keras", "neural-network", "deep-learning"]:
            return 1
    return 0
all_qu["DeepLearning"] = all_qu["Tags"].apply(deep_learning_filter)

all_qu
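
A set-based variant of the same filter is sketched below for comparison. It uses the same tag list as above; `DEEP_LEARNING_TAGS` and `is_deep_learning` are names introduced here, and the sample tag lists in the usage lines are for illustration only.

```python
# Same tags as in the filter above, stored as a set for fast membership checks.
DEEP_LEARNING_TAGS = {
    "lstm", "cnn", "scikit-learn", "tensorflow",
    "keras", "neural-network", "deep-learning",
}

def is_deep_learning(tags):
    """Return 1 if at least one tag is a deep learning tag, else 0."""
    return int(not DEEP_LEARNING_TAGS.isdisjoint(tags))

print(is_deep_learning(["python", "keras"]))
print(is_deep_learning(["r", "pandas", "dplyr"]))
```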

In [None]:
all_qu['CreationDate'] = pd.to_datetime(
    all_qu['CreationDate'],
    format='%Y-%m-%d %H:%M:%S'
)

all_qu.sort_values(
    by='CreationDate',
    inplace=True,
    ascending=False
)

all_qu

Since we have almost no data from 2020, we will drop that year.

In [None]:
all_qu = all_qu[all_qu["CreationDate"].dt.year < 2020]
# Test:
# all_qu.head(5)

In [None]:
all_qu_mod = all_qu.copy()
# year = all_qu["CreationDate"].dt.year
# month = all_qu["CreationDate"].dt.month
# all_qu_mod["year"] = year
# all_qu_mod["month"] = month
# all_qu_mod["month"] = all_qu_mod["month"].astype(int)

# https://stackoverflow.com/questions/41369227/formatting-quarter-time-in-pandas-columns
all_qu_mod["Quarter"] = (all_qu_mod['CreationDate']
                         .dt.to_period("Q")
                        )

all_qu_mod["Quarter"] = (all_qu_mod["Quarter"]
                         .astype(str)
                         .str.replace('Q', ' Q: ')
                        )
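
To show what the resulting labels look like, here is a minimal sketch that applies the same two steps to the two creation dates from the table earlier in the notebook. `dates` and `quarter_labels` are names introduced here.

```python
import pandas as pd

# The two creation dates from the filtered table shown earlier.
dates = pd.Series(pd.to_datetime(["2019-09-11 20:35:17", "2019-04-11 19:41:39"]))

quarter_labels = (
    dates
    .dt.to_period("Q")         # quarterly period, e.g. 2019Q3
    .astype(str)
    .str.replace("Q", " Q: ")  # more readable label for the plots
)

print(quarter_labels.tolist())
```

Because the year comes first in the label, sorting the labels as strings keeps the quarters in chronological order.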

In [None]:
quarters = (
    all_qu_mod.groupby(['Quarter'])
    .agg({'DeepLearning':['sum', 'size']})
)
quarters.columns = [
    "Deep learning count",
    "Total amount of questions"
]

# Percentage of deep learning questions per quarter:
quarters["Deep Learning PCT"] = (
    quarters["Deep learning count"]
    / quarters["Total amount of questions"] * 100
).round(2)

quarters.reset_index(inplace=True)

quarters.sort_values(
    by="Quarter",
    inplace=True,
    ascending=True
)

quarters.head()
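
The aggregation can be checked by hand on a tiny example. The sketch below uses invented data and pandas named aggregation instead of the rename step above; `toy`, `summary` and the column names are assumptions made for this illustration.

```python
import pandas as pd

# Invented data: two questions in Q1 (one deep learning),
# four questions in Q2 (three deep learning).
toy = pd.DataFrame({
    "Quarter": ["2019 Q: 1"] * 2 + ["2019 Q: 2"] * 4,
    "DeepLearning": [1, 0, 1, 1, 1, 0],
})

summary = toy.groupby("Quarter").agg(
    deep_learning_count=("DeepLearning", "sum"),  # sum of the 0/1 flags
    total_questions=("DeepLearning", "size"),     # number of rows per quarter
)

# Share of deep learning questions, as a percentage.
summary["deep_learning_pct"] = (
    summary["deep_learning_count"] / summary["total_questions"] * 100
).round(2)

print(summary)
```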

In [None]:
ax = quarters.plot(x="Quarter", y="Deep Learning PCT",
                    kind="bar", color="grey",
                    figsize=(18,4), rot=45
                    )
ax.set_title("Deep Learning PCT", fontsize=40)
ax.tick_params(axis="both", 
               labelsize=20,
               left = False)
ax.grid(color='xkcd:bright red', axis='y')
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

In [None]:
ax2 = quarters.plot(
    x="Quarter",
    y="Total amount of questions",
    kind="bar",
    secondary_y=True,
    alpha=0.7,
    rot=45,
    figsize=(24,12),
    legend=None
)

ax3 = quarters.plot(
    x="Quarter",
    y="Deep learning count",
    kind="bar",
    ax=ax2,
    secondary_y=True,
    alpha=0.7,
    rot=45,
    color="lime",
    legend=None
)

for ax in (ax2, ax3):
    ax.set_title("Deep Learning vs Total", fontsize=40)
    ax.tick_params(axis="both", 
                   labelsize=20,
                   left = False)
    ax.grid(color='xkcd:bright red', axis='y')
    ax.legend(loc='upper left', fontsize=28)

Deep learning questions appear to have been a high-growth trend from 2015 to 2018, starting in the second quarter of 2015, almost from the launch of `DSSE`, with a plateau from `Q4 2017` to `Q2 2018`.

Since `Q3 2018`, interest in `deep learning` seems to have been roughly constant relative to the total volume of questions. We must point out, however, that the platform itself is growing: the number of questions and topics on `DSSE` increases almost constantly.

Because of that, we can say with confidence that DSSE is a good focal point for writing: its content is up to date and in constant development. Lastly, as we said before, it is hard to separate `deep learning` from the other subjects on DSSE, so there is no need to treat deep learning as a separate field.