<h2><b>UCDSC Package Overview</b></h2>

This notebook covers the functionality of the UConn Data Science Club's Python library. The library was originally built in January 2025 to make the club's technical workshops more accessible to its audience. The package's functions and capabilities are described below.

<h4><b>Installation & Updating</b></h4>


To install the package, the user can either run ```pip install uconndatascienceclub``` in their terminal or run the code cell below in a Jupyter notebook.

In [1]:
!pip install uconndatascienceclub



Note that the package may be updated before any technical workshop to accommodate new content. A user who already has the package installed needs to update it at the start of the workshop. The latest version number will be announced at that time, and the user can install it with ```pip install uconndatascienceclub=={version}```. For example: ```pip install uconndatascienceclub==2025.1.12```.
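
Before a workshop, it can help to confirm which version is currently installed. The following is a minimal sketch using only the standard library; the helper name ```installed_version``` is made up for illustration and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Compare this against the version announced at the workshop.
print(installed_version("uconndatascienceclub"))
```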

The user can import the package into their environment with the following line:

In [1]:
import uconndatascienceclub as ucdsc

<h4><b>Basic Commands</b></h4>

```ucdsc.socials```: returns a dictionary containing links to our club's social media pages and other forms of contact.  
```ucdsc.instagram```: returns a link to our club's Instagram page.  
```ucdsc.discord```: returns an invitation link to our club's Discord server.  
```ucdsc.uconntact```: returns a link to our club's UConntact page.  
```ucdsc.email```: returns the email address of our club.  
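
The listing above does not say whether these names are plain attributes or functions that need to be called. A small, hypothetical helper (```contact_info``` is not part of the package) can handle either case. It is shown here with a stand-in object and placeholder values so that the cell runs without the package installed.

```python
from types import SimpleNamespace


def contact_info(module, name):
    """Return module.<name>, calling it first if it turns out to be a function."""
    value = getattr(module, name)
    return value() if callable(value) else value


# Stand-in for the real package, with placeholder values (not the club's real links).
demo = SimpleNamespace(
    email="club@example.com",
    socials=lambda: {"email": "club@example.com"},
)

print(contact_info(demo, "email"))    # plain attribute
print(contact_info(demo, "socials"))  # callable that returns a dictionary
```

With the package imported as ```ucdsc```, the equivalent call would be ```contact_info(ucdsc, "socials")```.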

<h4><b>Data() class</b></h4>

The UCDSC package comes with a ```Data()``` class, mainly used to give club members easy access to the data being used live in a technical workshop.

<b>Initialization</b>  

The code below initializes a ```Data()``` object:

In [None]:
x = ucdsc.Data()

<b>Class Arguments: </b>  

- ```dataset```: *str* (default: ```None```)
    - Sets the instance's dataset. String must be a reference to an existing dataset in the package (see ```list_datasets()```).

<b>Methods: </b>  

- ```dataframe()```
    - Extracts the Pandas DataFrame object of the current dataset. Requires a dataset to be set (see ```set_data()```).
    - *Returns*: ```pandas.DataFrame```


- ```list_datasets()```
    - Lists the available datasets (strings) that can be passed into the instance.
    - *Returns*: ```list```

- ```set_data(data)```
    - Sets the instance's dataset for use. The dataset can also be passed in as a parameter when initializing the instance.
    - *Arguments*:
        - ```data (str)```: Name of the dataset to set. String must be a reference to an existing dataset in the package (see ```list_datasets()```).
    - *Returns*: ```None```

- ```save()```
    - Saves the CSV file of the currently set dataset. Requires a dataset to be set (see ```set_data()```).
    - *Returns*: ```None```

- ```standard(dim=1, size=100, state=None)```
    - Creates a Pandas DataFrame containing columns following the standard normal distribution.
    - *Arguments*: 
        - ```dim (int)```: Sets number of columns in the dataframe. *Default=1*.
        - ```size (int)```: Sets number of observations (sample size) in the dataframe. *Default=100*.
        - ```state (int)```: Sets the random state when generating data. *Default=None*.
    - *Returns*: ```pandas.DataFrame```

- ```uniform(dim=1, size=100, state=None)```
    - Creates a Pandas DataFrame containing columns following the uniform distribution.
    - *Arguments*: 
        - ```dim (int)```: Sets number of columns in the dataframe. *Default=1*.
        - ```size (int)```: Sets number of observations (sample size) in the dataframe. *Default=100*.
        - ```state (int)```: Sets the random state when generating data. *Default=None*.
    - *Returns*: ```pandas.DataFrame```
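
To illustrate how ```dim```, ```size```, and ```state``` shape the output of ```standard()```, here is a rough NumPy equivalent. This is a sketch, not the package's implementation: the function name ```standard_sketch```, the column names ```x0, x1, ...```, and the use of ```numpy.random.default_rng``` are all assumptions.

```python
import numpy as np
import pandas as pd


def standard_sketch(dim=1, size=100, state=None):
    """Build a DataFrame with `size` rows and `dim` standard-normal columns."""
    rng = np.random.default_rng(state)  # state=None gives a different draw each run
    values = rng.standard_normal((size, dim))
    columns = [f"x{i}" for i in range(dim)]  # hypothetical column names
    return pd.DataFrame(values, columns=columns)


df = standard_sketch(dim=3, size=50, state=42)
print(df.shape)  # (50, 3)
```

Replacing ```rng.standard_normal((size, dim))``` with ```rng.uniform(size=(size, dim))``` would give a uniform analogue over [0, 1); the range the package itself uses is not stated here.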


<b>Sample Implementation</b>  

In [3]:
x = ucdsc.Data('mall')
df = x.dataframe()
df.head()

| | CustomerID | Genre | Age | Annual Income (k$) | Spending Score (1-100) |
|---|---|---|---|---|---|
| 0 | 1 | Male | 19 | 15 | 39 |
| 1 | 2 | Male | 21 | 15 | 81 |
| 2 | 3 | Female | 20 | 16 | 6 |
| 3 | 4 | Female | 23 | 16 | 77 |
| 4 | 5 | Female | 31 | 17 | 40 |
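
Since ```dataframe()``` and ```save()``` both require a dataset to be set, it can be convenient to check a name against ```list_datasets()``` first. The helper below is a hypothetical sketch (```load_dataset``` is not part of the package) that relies only on the methods documented above. A small stand-in class with made-up data is used so that the example runs without the package installed.

```python
import pandas as pd


def load_dataset(data_obj, name):
    """Set `name` on a Data-like object and return its DataFrame.

    Raises ValueError if `name` is not among the available datasets.
    """
    available = data_obj.list_datasets()
    if name not in available:
        raise ValueError(f"Unknown dataset {name!r}; choose from {available}")
    data_obj.set_data(name)
    return data_obj.dataframe()


class FakeData:
    """Stand-in that mimics the documented interface with made-up data."""

    def __init__(self):
        self._name = None

    def list_datasets(self):
        return ["mall"]

    def set_data(self, data):
        self._name = data

    def dataframe(self):
        return pd.DataFrame({"Age": [19, 21, 20]})


df = load_dataset(FakeData(), "mall")
print(len(df))  # 3
```

With the real package, the call would be ```load_dataset(ucdsc.Data(), "mall")```.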


<h4><b>script_writer</b></h4>

The user can import the ```script_writer``` script with the following code:

In [1]:
from uconndatascienceclub import script_writer

The ```script_writer``` script writes an entire technical workshop script (either .ipynb or .py) into the user's current working directory. Its functions are listed below.

- ```available_dates()```
    - Lists dates of previous technical workshops that scripts can be written from.
    - *Returns*: ```list```

- ```write(date)```
    - Writes the script(s) of the technical workshop from the given input date into the user's current working directory.
    - *Arguments*:
        - ```date (str)```: The date on which the script was demonstrated. Date must be from an available technical workshop (see ```available_dates()```).
    - *Returns*: ```None```

<b>Sample Implementation</b>  

In [None]:
from uconndatascienceclub import script_writer

script_writer.write('1/30/2025') # advanced data visualization workshop
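
To fetch the most recent workshop without typing its date, the list from ```available_dates()``` can be searched for its latest entry. The sketch below assumes every date string uses the same month/day/year format as ```'1/30/2025'```; the helper ```latest_date``` and the sample dates are made up for illustration.

```python
from datetime import datetime


def latest_date(dates):
    """Return the most recent date string, assuming M/D/YYYY formatting."""
    return max(dates, key=lambda d: datetime.strptime(d, "%m/%d/%Y"))


# Made-up dates standing in for script_writer.available_dates().
sample_dates = ["11/14/2024", "1/30/2025", "2/6/2025"]
print(latest_date(sample_dates))  # 2/6/2025
```

With the package imported, this becomes ```script_writer.write(latest_date(script_writer.available_dates()))```.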