# Setup

**CS5483 Data Warehousing and Data Mining**
___

In [None]:
%reset -f
from IPython import display

## JupyterHub

### How to access the JupyterHub Server?

1. Enter the URL of the JupyterHub server [ltjh.cs.cityu.edu.hk](https://ltjh.cs.cityu.edu.hk) in a web browser.
1. Enter your [EID](https://www.cityu.edu.hk/esu/eid.htm) and Password in the fields `Username` and `Password` respectively.
1. Click the `Sign In` button.

**Tips**
- If the browser is stuck at the following page loading the server, `refresh` your browser.  
  ![server stuck](https://www.cs.cityu.edu.hk/~ccha23/cs1302/server_stuck.png)
- If you see the following page with ``My Server`` button, click on that button.  
  ![server start](https://www.cs.cityu.edu.hk/~ccha23/cs1302/server_start.png)
- If you see the ``Start My Server`` button instead, click on that button to start your server.  
  ![server stopped](https://www.cs.cityu.edu.hk/~ccha23/cs1302/server_stopped.png)
- For other issues, try logging out using the `Logout` button at the top right-hand corner, and then logging in again. You may also click the `Control Panel` button and restart your server.

### How to access course notebooks?

The preferred method (with version control built-in) is to 
1. go to the [course homepage](https://www.cs.cityu.edu.hk/~ccha23/cs5483/), and 
1. click a link with the notebook extension (*.ipynb) to open the notebook. 

This uses [nbgitpuller][nbgitpuller] to pull the latest version automatically from the GitHub repository:  

- Jupyter Book: <https://ccha23.github.io/cs5483>

where you may also choose to open the notebook in Colab as a backup.

[nbgitpuller]: https://jupyterhub.github.io/nbgitpuller
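
For the curious, the sketch below shows how an nbgitpuller link is typically composed from a hub address, a repository, a branch, and the path to open after pulling. The repository URL, branch, and notebook path are hypothetical placeholders, not the actual course values; the links on the course homepage remain the ones to use.

```python
from urllib.parse import urlencode


def nbgitpuller_link(hub: str, repo: str, branch: str, urlpath: str) -> str:
    """Compose a git-pull link: the hub pulls `repo` and then opens `urlpath`."""
    query = urlencode({"repo": repo, "branch": branch, "urlpath": urlpath})
    return f"{hub}/hub/user-redirect/git-pull?{query}"


link = nbgitpuller_link(
    hub="https://ltjh.cs.cityu.edu.hk",
    repo="https://github.com/example/course-notebooks",  # hypothetical repository
    branch="main",                                       # hypothetical branch
    urlpath="tree/course-notebooks/Setup.ipynb",         # hypothetical notebook path
)
print(link)
```

Because the pull is merged into your existing copy, reopening such a link is what keeps your notebooks up to date without discarding your own edits.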

In [None]:
display.IFrame(src="https://cityuhk-lms.ap.panopto.com/Panopto/Pages/Embed.aspx?id=e1fad469-56cf-47d0-9a26-acae003c70fe&autoplay=false&offerviewer=true&showtitle=true&showbrand=false&start=0&interactivity=all", height=450, width=800)

Alternatively, you can also fetch a notebook in the classic notebook interface as follows. Unlike the previous method, this does not refetch and merge changes:

1. Click the `Assignments` tab, and ensure the correct course code is chosen in the drop-down list.
1. In the `Released assignments` panel, click the button `Fetch` to download an assignment, if any. 
1. If the assignment has been fetched, it should appear in the `Downloaded assignments` panel. 
1. Click the arrow next to the assignment to show its content.  
1. Click each item to open the corresponding notebook.

**Tips**
- All downloaded course materials are placed under a folder (named after the course code) in your home directory, so you need not go to the `Assignments` tab again to open them. E.g., you can access the notebook as follows:
    1. Go to the `File` tab, which is the default JupyterHub homepage after login or when you click the logo on the top left-hand corner.
    1. Enter the notebook URL following the [documentation](https://jupyterhub.readthedocs.io/en/stable/reference/urls.html).
- If for any reason you want to refetch an assignment, you have to do one of the following:
    1. Rename your assignment folder by selecting the folder and clicking `Rename`. 
    1. Remove the folder by evaluating `!rm -rf {path}` in a code cell where `{path}` is the path to the assignment folder. (Be very cautious as removed folders cannot be recovered.)
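
As a safer middle ground between the two options above, the following sketch renames an assignment folder to a timestamped backup name from a code cell, so nothing is deleted. The helper name `backup_folder` and the example path are hypothetical; adjust the path to your own course folder.

```python
from datetime import datetime
from pathlib import Path


def backup_folder(folder: Path) -> Path:
    """Rename `folder` to `<name>.bak-<timestamp>` in the same directory."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = folder.with_name(f"{folder.name}.bak-{stamp}")
    folder.rename(target)  # a rename keeps all files, unlike `rm -rf`
    return target


# Hypothetical usage: move ~/cs5483/Lab1 out of the way before fetching again.
# backup_folder(Path.home() / "cs5483" / "Lab1")
```

After the rename, the assignment can be fetched again while your earlier work stays available in the backup folder.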


## Jupyter Notebook

### How to provide your answers?

After opening a notebook:
1. Click `Help->User Interface Tour` to learn the classic notebook interface. 
1. Click `Help->Notebook Help` and skim through the tutorials on `Running Code` and `Working with Markdown Cells`.

**Exercise** The first program to write is often the ["Hello, World!"](https://en.wikipedia.org/wiki/%22Hello,_World!%22_program) program, which says hello to the world. Type the program `print('Hello, World!')` below and run it with `Shift+Enter`.

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

We often ask you to write your answer in a particular cell. Make sure you replace `YOUR ANSWER HERE` or
```Python
# YOUR CODE HERE
raise NotImplementedError()
```
with your answer. The line `raise NotImplementedError()` will raise an error when you execute the cell, to inform you that an answer is needed.

There may be visible and/or hidden tests to check your answers automatically. The following is a visible test you can run to check your answer. The test raises an `AssertionError` if your program does not print the correct message.

In [None]:
# tests: Run this test cell right after running your "Hello, World!" program.
import sys, io
old_stdout, sys.stdout = sys.stdout, io.StringIO()
try:
    exec(In[-2])  # re-run the cell executed just before this one
    printed = sys.stdout.getvalue()
finally:
    sys.stdout = old_stdout  # restore stdout even if the cell raises
assert printed == 'Hello, World!\n'
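
To see how such a test works outside the notebook, here is a minimal sketch of the same idea: run some source code, capture what it prints, and compare it with the expected message. In the test cell, `In` is IPython's input history, so `In[-2]` is the source of the cell run just before the test. The helper `run_and_capture` and the string `student_cell` below are stand-ins introduced only for illustration.

```python
import io
from contextlib import redirect_stdout


def run_and_capture(source: str) -> str:
    """Execute `source` and return everything it printed to stdout."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):  # stdout is restored when the block exits
        exec(source, {})
    return buffer.getvalue()


student_cell = "print('Hello, World!')"  # stand-in for the previous cell's source
printed = run_and_capture(student_cell)
assert printed == 'Hello, World!\n'
```

Note that `print` appends a newline, which is why the expected string ends with `\n`.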

**Tips**
1. Do not duplicate or delete a solution or test cell because they must have a unique id attached for auto-grading.
1. You can repeatedly modify your solution and run the test cell until your solution passes the tests. 
1. To assess your solution thoroughly, we may run new tests hidden from you after we have collected your notebook.
1. You can click the `Validate` button to run all the visible tests.
1. If you open the same notebook multiple times in different browser windows, be careful when making changes in different windows. Inconsistent changes may lead to conflicts or loss of your data.
1. If your notebook fails to run any code, the kernel might have died. You can restart the kernel with `Kernel->Restart`. If restarting fails, check your code cells to see if there is any code that breaks the kernel.

### How to submit a notebook?

To submit your notebook:
1. Go to the `Assignments` tab of JupyterHub where you fetched the Lab assignment. 
1. Expand the assignment folder and click the `Validate` button next to the notebook(s) to check if all visible tests pass.
1. Click `Submit` to submit your notebook. 
1. You may submit as many times as you wish before the deadline; we will collect the latest submission made before the deadline. Submissions after the deadline are not collected.

In [None]:
display.IFrame(src="https://cityuhk-lms.ap.panopto.com/Panopto/Pages/Embed.aspx?id=994d9e5c-d9f8-40d0-9322-acae003d6b36&autoplay=false&offerviewer=true&showtitle=true&showbrand=false&start=0&interactivity=all", height=450, width=800)

## Advanced Usage

### Print or backup a notebook

To convert a notebook to PDF, print it to PDF from the print preview: 
- `File->Print Preview`

However, animations and videos cannot be printed properly. We highly recommend taking notes in the live notebook instead of on a hard copy.

To download a copy of your notebook:
- `File->Download as->Notebook (.ipynb)`

You can run the notebook
- locally using [Anaconda](https://www.anaconda.com/products/individual), or
- remotely on other hosted Jupyter services such as [Google Colab](https://colab.research.google.com/).

However, you would need to learn how to manage and install the additional packages required by the course.
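
As a starting point, the sketch below checks which packages are missing from the current environment before you try to run a notebook elsewhere. The package list is an assumption chosen for illustration, not the official course requirements; replace it with the imports that appear in the notebooks you want to run.

```python
from importlib.util import find_spec

# Assumed example list, not the official course requirements.
REQUIRED = ["numpy", "pandas", "matplotlib"]


def missing_packages(names):
    """Return the top-level packages in `names` that cannot be imported here."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Install these before running the notebooks:", ", ".join(missing))
else:
    print("All listed packages are available.")
```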

### JupyterLab and extensions

Instead of the classic notebook interface, you may also play with the new interface called JupyterLab by visiting <https://ltjh.cs.cityu.edu.hk/user-redirect/lab>. Note that the new interface does not support the validation and submission of assignments.

You may use the [visual debugger](https://github.com/jupyterlab/debugger) in JupyterLab to debug a Jupyter notebook. To do so, you should open the notebook with `XPython` as the kernel.

Both the notebook and lab interfaces are extensible. For the notebook interface, you can enable extensions from the [nbextensions page](https://ltjh.cs.cityu.edu.hk/user-redirect/nbextensions).

### Visual Studio Code

For a complete IDE experience, you can open VS Code as follows:
- <https://ltjh.cs.cityu.edu.hk/user-redirect/vscode>
- In the classic notebook interface: `File` tab -> `New` menu -> `VS Code` menu item.
- In the JupyterLab interface: `File` menu -> `New Launcher` menu item -> `VS Code` icon
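
The direct links for JupyterLab, the nbextensions page, and VS Code all share the `user-redirect` pattern, which sends you to the given path on your own server after login. The sketch below composes such links; the helper name and the notebook path in the last line are hypothetical and only illustrate the pattern.

```python
from urllib.parse import quote

HUB = "https://ltjh.cs.cityu.edu.hk"  # the course JupyterHub server


def user_redirect_url(path: str) -> str:
    """Build a hub URL that redirects to `path` on the logged-in user's server."""
    return f"{HUB}/user-redirect/{quote(path.lstrip('/'))}"


print(user_redirect_url("lab"))     # JupyterLab interface
print(user_redirect_url("vscode"))  # VS Code
# Hypothetical notebook path, for illustration only:
print(user_redirect_url("notebooks/cs5483/Lab1/Setup.ipynb"))
```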

### Version Control

By default, your changes to a notebook are automatically saved. You can also create a checkpoint by going to `File` tab->`Save and Checkpoint` menu item.

If you want to create multiple checkpoints/versions, the JupyterHub server provides the following tools for version control:
- [jupyterlab-git][jupyterlab-git]
- [vscode-git][vscode-git]
- [nbdiff][nbdiff]

See the link to GitHub in the next section and also [this post][code-server-github-login] to set up a GitHub login token.

[jupyterlab-git]: https://github.com/jupyterlab/jupyterlab-git 
[vscode-git]: https://code.visualstudio.com/docs/introvideos/versioncontrol
[code-server-github-login]: https://github.com/cdr/code-server/issues/1883
[nbdiff]: https://github.com/tarmstrong/nbdiff

### Terminal

You can launch terminals in both the JupyterLab and VS Code interfaces. The Python and XPython kernels use Anaconda. To activate it in a terminal, execute the following command: 
```bash
/opt/anaconda/condabin/conda activate
```
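
To check which Python installation a kernel or terminal session is actually using, you can inspect the interpreter from Python itself. This is a minimal sketch; the expectation that the paths point under `/opt/anaconda` is an assumption based on the command above.

```python
import sys

# Location of the running interpreter and its environment.
print("Interpreter:", sys.executable)
print("Environment prefix:", sys.prefix)
print("Python version:", sys.version.split()[0])
```

If the terminal and the notebook kernel report different prefixes, packages installed in one will not be visible in the other.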

## Optional Signup of Other Accounts

- Create a [professional GitHub account](https://education.github.com/pack) free for students. This can be useful for collaborative work such as the group project.

- Create a [CityU Google GApps account](https://www.cityu.edu.hk/csc/deptweb/support/faq/email/GApps_faq.htm). The account also comes with unlimited Google Drive storage, which can be useful for storing and sharing project datasets and video presentations.