# Rubik's Cubes: An Analysis of Scrambles
Jaraad Kamal

## Background and Definitions
The Rubik's Cube is a puzzle invented by Ernő Rubik in 1974. Over the years it has gained immense popularity as more and more people have learned to solve it and to compete.
I have been solving Rubik's Cubes for over half my life. I am by no means a competition-level solver, but I have always been interested in the highest levels of speed cubing.
> **Speed Cubing**
> <br>
> Competitively solving Rubik's Cubes as fast as possible.

In this tutorial we will discuss the scrambles of a **3x3 Rubik's Cube**. 
> **Scramble**
> <br>
> A random set of moves used to get a Rubik's Cube or puzzle into a random unsolved state.
> <br>
> The moves needed to "mix up" a Rubik's Cube.
### Goal
Competitions have been held since the 1980s. In this tutorial we will find out whether some scrambles (initial shuffles of the cube) are harder to solve than others. Further, we can develop a model to determine whether a particular scramble is more difficult than others.


### Notation
Before getting into the data and the code, we must first understand how Rubik's Cube notation works.
<br>
The notation is used as a way to describe which moves are being performed.
<br><br>
There are 4 types of moves for a 3x3 Rubik's Cube:
- Whole cube rotations
- Face Turns
- Wide Moves
- Slice Moves

> **Visual Depictions**
> <br>https://jperm.net/3x3/moves

For the purposes of this tutorial only **Face Turns** will be examined, as they are the only type of move used when scrambling.
<br>
*(note: every type of move can be accomplished with only face turns)*

#### Face Turns
Each Face Turn corresponds to a particular face of the cube.
The basic moves are: 

|Name   | Notation| Variant  |
|----   | --------| ---------|
| Up    | U       | U2 or U' |
| Down  | D       | D2 or D' |
| Left  | L       | L2 or L' |
| Right | R       | R2 or R' |
| Front | F       | F2 or F' |
| Back  | B       | B2 or B' |


Each move corresponds to one of the 6 faces of the cube.
<br>
A plain letter means turning that face 90 degrees clockwise (when viewing the face head on). Adding a `2` means rotating the face 180 degrees (90 degrees twice), and an apostrophe `'` (pronounced *prime*) means a 90 degree counterclockwise rotation.
> **Example**
><br>
> `U` means turn the topmost face 90 degrees **clockwise**
><br>
> `U2` means turn the topmost face 180 degrees
><br>
> `U'` means turn the topmost face 90 degrees **counterclockwise**

A typical scramble thus looks like:
<br>
`D U2 F2 D R2 D2 L2 U' R2 B' R2 D F U2 F L2 R' D L`
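
The notation rules above can be turned into a few lines of code. The following is a minimal sketch, not part of the WCA tooling: `parse_scramble` and `FACES` are hypothetical names chosen for illustration. It splits a scramble into `(face, quarter_turns)` pairs, where a plain move is `1`, a `2` move is `2`, and a prime move is `-1`.

```python
FACES = {"U", "D", "L", "R", "F", "B"}


def parse_scramble(scramble):
    """Split a scramble string into (face, quarter_turns) pairs.

    quarter_turns is 1 for a plain move (90 degrees clockwise),
    2 for a "2" move (180 degrees) and -1 for a prime move
    (90 degrees counterclockwise).
    """
    moves = []
    for token in scramble.split():
        face, suffix = token[0], token[1:]
        if face not in FACES:
            raise ValueError(f"unknown face in move {token!r}")
        if suffix == "":
            turns = 1
        elif suffix == "2":
            turns = 2
        elif suffix == "'":
            turns = -1
        else:
            raise ValueError(f"unknown suffix in move {token!r}")
        moves.append((face, turns))
    return moves


example = "D U2 F2 D R2 D2 L2 U' R2 B' R2 D F U2 F L2 R' D L"
parsed = parse_scramble(example)
print(len(parsed))  # number of moves in the scramble
print(parsed[:3])   # first three (face, quarter_turns) pairs
```

Representing each move as a pair keeps the face and the amount of rotation separate, which should make it easier to count or compare moves later on.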

Now that we understand how notation and scrambles work, we can get into the actual data wrangling.

## Getting the Data
For this project I will be using the database created by the **World Cube Association** (WCA), which hosts the largest and most up-to-date database of competitions throughout the world. I will be using the data up to April 4th, 2022.
> **Links**
><br>
> WCA Homepage: https://www.worldcubeassociation.org/
><br>
> WCA Database Download: https://www.worldcubeassociation.org/results/misc/export.html

For this tutorial, download the SQL file and extract the contents into a subfolder of your choice.

## Getting to Know the Data
The file created by the WCA is *big*. Before coding blindly, it is important to get comfortable with the way it is formatted. The database comes with a **README** file that gives an overview.
<br><br>
Briefly, the file (or SQL Database) is a collection of tables each with their own information. According to the **README** the database itself consists of the following tables:
>| Table                                   | Contents                                           |
>| --------------------------------------- | -------------------------------------------------- |
>| Persons                                 | WCA competitors                                    |
>| Competitions                            | WCA competitions                                   |
>| Events                                  | WCA events (3x3x3 Cube, Megaminx, etc)             |
>| Results                                 | WCA results per competition+event+round+person     |
>| RanksSingle                             | Best single result per competitor+event and ranks  |
>| RanksAverage                            | Best average result per competitor+event and ranks |
>| RoundTypes                              | The round types (first, final, etc)                |
>| Formats                                 | The round formats (best of 3, average of 5, etc)   |
>| Countries                               | Countries                                          |
>| Continents                              | Continents                                         |
>| Scrambles                               | Scrambles                                          |
>| championships                           | Championship competitions                          |
>| eligible_country_iso2s_for_championship | See explanation below                              |

For this tutorial we are examining whether some scrambles are harder than others. To do this we will only look at the `Scrambles` and the `Results` tables.
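
To make the idea of combining the two tables concrete, here is a minimal sketch using an in-memory SQLite database. Everything in it is an assumption for illustration: the column names (`competitionId`, `eventId`, `roundTypeId`, `scramble`, `personId`, `best`), the event ID `'333'` for the 3x3 cube, and the rows themselves, which are made-up placeholders rather than WCA data. Check the **README** for the real schema before relying on any of these names.

```python
import sqlite3

# In-memory stand-in for the export; the real file is not touched here.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE Scrambles "
    "(competitionId TEXT, eventId TEXT, roundTypeId TEXT, scramble TEXT)"
)
demo_conn.execute(
    "CREATE TABLE Results "
    "(competitionId TEXT, eventId TEXT, roundTypeId TEXT, personId TEXT, best INTEGER)"
)

# Placeholder rows (invented for this sketch, not real results).
demo_conn.executemany(
    "INSERT INTO Scrambles VALUES (?, ?, ?, ?)",
    [
        ("DemoComp", "333", "f", "R U R' U'"),
        ("DemoComp", "222", "f", "R U2 R'"),
    ],
)
demo_conn.executemany(
    "INSERT INTO Results VALUES (?, ?, ?, ?, ?)",
    [
        ("DemoComp", "333", "f", "demo-person-1", 1234),
        ("DemoComp", "333", "f", "demo-person-2", 1500),
    ],
)

# Pair every 3x3 scramble with the results from the same competition round.
rows = demo_conn.execute(
    """
    SELECT s.scramble, r.personId, r.best
    FROM Scrambles AS s
    JOIN Results AS r
      ON  r.competitionId = s.competitionId
      AND r.eventId = s.eventId
      AND r.roundTypeId = s.roundTypeId
    WHERE s.eventId = '333'
    ORDER BY r.best
    """
).fetchall()
demo_conn.close()

for row in rows:
    print(row)
```

This join only links scrambles and results at the level of a round. Matching an individual solve to the exact scramble it was performed on would likely need more care.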

## The Code
For this tutorial I will be working in **Python 3**.
<br>
First, let's import some libraries that will be useful later on.

In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import sqlite3  # standard-library driver used by sqlite3.connect below

### Loading the Database
The following code opens up the file from WCA and allows us to perform operations on the data.

In [7]:
sqlite_file = 'extracted_sql/WCA_export.sql' # This can be replaced with the path to the downloaded file
conn = sqlite3.connect(sqlite_file)
cursor = conn.cursor()
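
One thing to keep in mind is that `sqlite3.connect` does not read the file when it connects, so it tends to succeed even when the path points to something that is not an SQLite database, such as a plain-text SQL dump. A minimal sketch of a sanity check follows; `looks_like_sqlite` is a hypothetical helper, and it relies on the fact that SQLite database files begin with the 16-byte header `SQLite format 3` followed by a null byte. The demonstration uses two throwaway files, not the WCA export.

```python
import os
import sqlite3
import tempfile

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of an SQLite database file


def looks_like_sqlite(path):
    """Return True if the file at `path` starts with the SQLite header."""
    with open(path, "rb") as f:
        return f.read(16) == SQLITE_MAGIC


# Build one real SQLite file and one plain-text SQL file to compare.
tmp_dir = tempfile.mkdtemp()
db_path = os.path.join(tmp_dir, "demo.db")
dump_path = os.path.join(tmp_dir, "demo.sql")

demo = sqlite3.connect(db_path)
demo.execute("CREATE TABLE t (x INTEGER)")
demo.commit()
demo.close()

with open(dump_path, "w") as f:
    f.write("CREATE TABLE t (x INTEGER);\n")

print(looks_like_sqlite(db_path))
print(looks_like_sqlite(dump_path))
```

If the check fails on the downloaded file, the export is probably a text dump that has to be loaded into a database server or converted before it can be queried.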

Now we can take a quick look at the data. The following code asks the database for its list of tables, the first step toward reading the `Scrambles` and `Results` tables.

In [8]:
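q = "SHOW TABLES"  # MySQL syntax for listing the tables in a database
pd.read_sql(q, conn)

DatabaseError: Execution failed on sql 'SHOW TABLES': near "SHOW": syntax error

The query fails because `SHOW TABLES` is MySQL syntax, which the SQLite driver does not recognize.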
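
In SQLite the list of tables lives in the built-in `sqlite_master` table. The following is a minimal sketch of querying it, run against an in-memory database with two empty stand-in tables (their single columns are placeholders, not the real WCA schema).

```python
import sqlite3

# In-memory database with two empty stand-in tables.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE Scrambles (scramble TEXT)")
demo_conn.execute("CREATE TABLE Results (best INTEGER)")

# sqlite_master holds one row per table, index, view and trigger.
table_names = [
    name
    for (name,) in demo_conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
demo_conn.close()

print(table_names)
```

The same query string can be passed to `pd.read_sql` with an open SQLite connection to get the table names as a DataFrame.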

As good hygiene we will now close the database using the following code.
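> Closing the connection releases the database file and any resources the connection holds. Note that `sqlite3` does not commit pending changes on close, so call `conn.commit()` first if anything was written.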

In [5]:
conn.close()
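
An alternative to calling `close()` by hand is to let a `with` block do it. The sketch below wraps the connection in `contextlib.closing` from the standard library, so the connection is closed even if a query raises an error. The `notes` table is a placeholder used only for this example.

```python
import sqlite3
from contextlib import closing

with closing(sqlite3.connect(":memory:")) as demo_conn:
    demo_conn.execute("CREATE TABLE notes (body TEXT)")
    demo_conn.execute("INSERT INTO notes VALUES ('hello')")
    demo_conn.commit()  # make the insert permanent before the connection closes
    count = demo_conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

# The connection is closed once the with-block ends; further use raises an error.
try:
    demo_conn.execute("SELECT 1")
    still_open = True
except sqlite3.ProgrammingError:
    still_open = False

print(count, still_open)
```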