# Budget Runner

In [1]:
import openpyxl
import math
import io
import os
import json
import calendar

import pandas as pd
import numpy as np

import csv
from csv import DictReader
from openpyxl import Workbook, load_workbook
from datetime import date, datetime
from pathlib import Path
from itertools import chain
import matplotlib.pyplot as plt

In [2]:
# first_fiscal = "2023-01-01"
# last_fiscal = "2023-12-31"

In [3]:
%load_ext autoreload
%autoreload 2

## GitHub Link

To view the repository, click the link here: https://github.com/Aaron-M-R/Budget

## Purpose
After having a bank account for quite some time, I decided to keep better track of my spending. While many people either manually look at every charge in their card's history or don't look at all, I wanted to automatically sort my spending and sum the charges based on category. I then created this project in order to do just that. I focused mainly on finding a way to see how much money I spent on food or on transportation and also keep track of my income. I also wanted to easily visualize my spending over time. This program can do all of that.


## Instructions
First, make sure you have cloned the repository called Budget (click [here](https://github.com/Aaron-M-R/Budget) to access). Then, add a spreadsheet of your spending to the repo as an Excel workbook. The columns of the spreadsheet should be in the following order: date, status, type, check number, description, withdrawal, deposit, running balance. You can then either run the program (titled BudgeJudy.py) in a terminal or open and run the Jupyter notebook (titled BudgetRunner.ipynb) locally on your machine. Restarting the kernel and running all of the cells in the notebook first runs the Python program and then provides some additional visualizations.


## Use
Enter the name of your Excel workbook, pick the range of dates you want to analyze, and enter the day, month, and year of each date separately. Then, enter the title you want for your new sheet. The program updates the Excel workbook by adding a new sheet and filling it with your spending per month in each category. It then creates a bar plot of total spending in each category and subcategory. Finally, you are asked whether you want to visualize your spending over time; if so, the program creates a line plot showing your monthly spending in some or all of the categories of interest.
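
The monthly per-category summing described above can be sketched in plain Python. This is only an illustration, not the code in BudgeJudy.py: the keyword mapping `CATEGORY_KEYWORDS`, the helper names `categorize` and `monthly_totals`, and the sample charges are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical keyword-to-category mapping; the real one lives in the repository.
CATEGORY_KEYWORDS = {
    "Food": ["GROCERY", "CAFE"],
    "Transportation": ["GAS", "METRO"],
}

def categorize(description):
    """Return the first category whose keyword appears in the description."""
    upper = description.upper()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in upper for keyword in keywords):
            return category
    return "Other"

def monthly_totals(charges, start, end):
    """Sum withdrawals per (year, month, category) for dates in [start, end]."""
    totals = defaultdict(float)
    for charge_date, description, withdrawal in charges:
        if start <= charge_date <= end:
            key = (charge_date.year, charge_date.month, categorize(description))
            totals[key] += withdrawal
    return dict(totals)

# Made-up charges: (date, description, withdrawal amount)
charges = [
    (date(2023, 1, 3), "Corner Grocery", 42.50),
    (date(2023, 1, 9), "Metro fare", 20.00),
    (date(2023, 1, 21), "Campus Cafe", 7.25),
    (date(2023, 2, 2), "Gas station", 35.00),
    (date(2023, 3, 1), "Bookstore", 15.00),  # outside the chosen date range
]

totals_by_month = monthly_totals(charges, date(2023, 1, 1), date(2023, 2, 28))
```

Keying the totals by `(year, month, category)` keeps the date filter and the grouping in a single pass over the charges.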


In [None]:
from BudgeJudy import *

To begin, enter the name of your Excel workbook.


### Read in spreadsheet from data folder

In [None]:
df.head()

In [None]:
totals

### For updating categories

In [None]:
# Upload categories, subcategories and codes as JSON object to file

# path = Path("Data") / "category_descriptions.json"
# with open(path, "w") as outfile: 
#     json.dump(categories, outfile)

In [None]:
# Read in category descriptions of charges

# filepath = Path('Data') / 'category_descriptions.json'
# categories = json.loads(filepath.read_text())
# categories
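
The two commented-out cells above write the category descriptions to a JSON file and read them back. A minimal sketch of that round trip follows. The nested layout of `categories` is an assumption made for illustration, and a temporary directory stands in for the `Data` folder so that nothing in the repository is overwritten.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical structure: category -> subcategory -> description keywords.
categories = {
    "Food": {"Groceries": ["GROCERY"], "Restaurants": ["CAFE", "PIZZA"]},
    "Transportation": {"Fuel": ["GAS"]},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "category_descriptions.json"
    # Write the categories as JSON, then read them back from the same file.
    path.write_text(json.dumps(categories, indent=2))
    reloaded = json.loads(path.read_text())
```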

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
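def generator(d, n_points=1000):
    """Sample points uniformly from [-1, 1]^d and return the smallest and
    largest squared distance from the origin."""
    # Reduce each point to its squared norm right away, so the full
    # set of d-dimensional points is never held in memory at once.
    sq_dists = np.array([
        (np.random.uniform(low=-1, high=1, size=d) ** 2).sum()
        for _ in range(n_points)
    ])
    closest = sq_dists.min()
    furthest = sq_dists.max()
    return closest, furthest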

In [None]:
nums = pd.Series(2**np.arange(20))
mins = list()
maxs = list()

for i in nums:
    closest, furthest = generator(i)
    mins.append(closest)
    maxs.append(furthest)

In [None]:
plt.plot(mins)
plt.show()

In [None]:
plt.plot(pd.Series(maxs)/pd.Series(mins))
plt.show()
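
One way to sanity-check the plots above: each coordinate drawn uniformly from [-1, 1] has an expected square of 1/3, so the squared distance from the origin averages d/3. Its spread grows only like the square root of d, so the ratio of the furthest to the closest point should shrink toward 1 as d grows. The sketch below checks this for a single dimension; the seed, `d`, and `n_points` are arbitrary choices, not values from the cells above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the check is repeatable
d, n_points = 1000, 2000

# One row per point, one column per coordinate.
points = rng.uniform(low=-1, high=1, size=(n_points, d))
sq_dists = (points ** 2).sum(axis=1)

mean_sq_dist = sq_dists.mean()           # should be close to d / 3
ratio = sq_dists.max() / sq_dists.min()  # should be only slightly above 1
```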

```
BEGIN ASSIGNMENT
init_cell: false
export_cell: false
check_all_cell: false
files:
    - data
    - images

requirements: ../../../requirements.txt

generate:
    show_hidden: true
```

# Homework 1: Basic Python, Arrays, and DataFrames

## Due Thursday, January 25th at 11:59PM

Welcome to Homework 1! This week's homework will cover basic Python, arrays, and DataFrames. You can find additional help on these topics in [Chapter 1](https://www.inferentialthinking.com/chapters/01/what-is-data-science.html) of Computational and Inferential Thinking and [BPD 1-11](https://notes.dsc10.com/01-getting_started/tools.html) in the `babypandas` notes.


### Instructions

Remember to start early and submit often. You are given six slip days throughout the quarter to extend deadlines. See the syllabus for more details. With the exception of using slip days, late work will not be accepted unless you have made special arrangements with your instructor.

**Important**: For homeworks, the `otter` tests don't usually tell you that your answer is correct. More often, they help catch careless mistakes. It's up to you to ensure that your answer is correct. If you're not sure, ask someone (not for the answer, but for some guidance about your approach). These are great questions for office hours (the schedule can be found [here](https://dsc10.com/calendar)) or Ed. Directly sharing answers is not okay, but discussing problems with the course staff or with other students is encouraged. 

In [None]:
# Please don't change this cell, but do make sure to run it.
import babypandas as bpd
import matplotlib.pyplot as plt
import numpy as np
import otter
grader = otter.Notebook()

plt.style.use('ggplot')




## 1. The Three Musketeers vs. Les Trois Mousquetaires


In Lecture 1, we counted the number of times that the characters Amy, Beth, Jo, Meg, and Laurie were named in each chapter of the classic book, Little Women. In programming, the word "character" also refers to a single element of a string. For instance, the string `"3 zebras!"` has 9 characters – `"3"`, `" "`, `"z"`, `"e"`, `"b"`, `"r"`, `"a"`, `"s"`, and `"!"`. 
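
As a quick check of the example above, Python's built-in `len` counts the characters in a string, and `list` splits a string into its individual characters:

```python
text = "3 zebras!"

characters = list(text)     # each element is one character, including the space
num_characters = len(text)  # number of characters in the string
```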

Let's use this concept to see if *The Three Musketeers* has longer sentences in English or the original French. 

The following code generates a scatter plot in which each dot corresponds to a chapter of *The Three Musketeers*: blue dots for the English translation and green dots for the original French. The horizontal position of a dot measures the number of periods in the chapter. The vertical position measures the total number of characters in that chapter.

In [None]:
# This cell contains code that hasn't yet been covered in the course.
# It isn't expected that you'll understand the code, but you should be able to 
# interpret the scatter plot it generates.

import numpy as np
import matplotlib.pyplot as plt

plt.style.use('fivethirtyeight')

eng_file = "./data/The_Three_Musketeers.txt"
fre_file = "./data/Les_Trois_Mousquetaires.txt"

eng_chapters = open(eng_file, encoding="utf-8").read().split('Chapter ')[67:]
fre_chapters = open(fre_file, encoding="utf-8").read().split('CHAPITRE ')[67:]

eng_periods = np.char.count(eng_chapters, '.')
fre_periods = np.char.count(fre_chapters, '.')

eng_chars = [len(c) for c in eng_chapters]
fre_chars = [len(c) for c in fre_chapters]

plt.scatter(eng_periods, eng_chars, color='b')
plt.scatter(fre_periods, fre_chars, color='g')
plt.xlabel('Periods')
plt.ylabel('Characters')
plt.legend(['English', 'French'])  # a list keeps labels in plotting order; a set does not
plt.axis([0, 450, 0, 45000]); 
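
The per-chapter counts plotted by the cell above can also be computed in plain Python. The sketch below uses two made-up "chapters" rather than the book files, and divides total characters by total periods to get an overall characters-per-period figure.

```python
# Two made-up chapters standing in for the real text files.
chapters = [
    "It was late. The road was empty.",
    "He rode on. Night fell. No one spoke.",
]

periods = [chapter.count(".") for chapter in chapters]  # periods in each chapter
chars = [len(chapter) for chapter in chapters]          # characters in each chapter

# Overall ratio: total characters divided by total periods.
chars_per_period = sum(chars) / sum(periods)
```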

**Question 1.1.** How many periods are in the **English** chapter with the greatest number of characters? Assign either 1, 2, 3, 4, or 5 to the name `longest_English` below.

1. 35
2. 353
3. 400
4. 403
5. 37503

```
BEGIN QUESTION
name: q1_1
```

In [None]:
longest_English = 4 # Solution

In [None]:
## TEST ##
longest_English in [1,2,3,4,5]

In [None]:
## HIDDEN TEST ##
longest_English == 4

**Question 1.2.** How many periods are in the **French** chapter with the greatest number of characters? Assign either 1, 2, 3, 4, or 5 to the name `longest_French` below.

1. 332
2. 357
3. 387
4. 397
5. 407


```
BEGIN QUESTION
name: q1_2
```

In [None]:
longest_French = 1 # Solution

In [None]:
## TEST ##
longest_French in [1, 2, 3, 4, 5]

In [None]:
## HIDDEN TEST ##
longest_French == 1

**Question 1.3.** Which of the following is closest to the average number of characters per period in the **English** version of *The Three Musketeers*? Assign either 1, 2, 3, 4, or 5 to the name `chars_per_period_English` below.

1. 50
2. 100
3. 150
4. 200
5. 250


```
BEGIN QUESTION
name: q1_3
```

In [None]:
chars_per_period_English = 2 # Solution

In [None]:
## TEST ##
chars_per_period_English in [1, 2, 3, 4, 5]

In [None]:
## HIDDEN TEST ##
chars_per_period_English == 2

**Question 1.4.** Which of the following is closest to the average number of characters per period in the **French** version of *The Three Musketeers*? Assign either 1, 2, 3, 4, or 5 to the name `chars_per_period_French` below.

1. 50
2. 100
3. 150
4. 200
5. 250


```
BEGIN QUESTION
name: q1_4
```

In [None]:
chars_per_period_French = 2 # Solution

In [None]:
## TEST ##
chars_per_period_French in [1, 2, 3, 4, 5]

In [None]:
## HIDDEN TEST ##
chars_per_period_French == 2

**Question 1.5.** Which version of *The Three Musketeers* has a higher ratio of characters per period? Assign either 1 or 2 to the name `higher_ratio` below.

1. French
2. English


```
BEGIN QUESTION
name: q1_5
```

In [None]:
higher_ratio = 1 # Solution

In [None]:
## TEST ##
higher_ratio in [1,2]

In [None]:
## HIDDEN TEST ##
higher_ratio == 1

In [None]:
grader.check("q23")

The tests for this section only check that you've set each variable to one of the listed option numbers (1 through 5, or 1 or 2 for the last question). Unlike in labs, tests in homeworks **do not** check that you answered correctly; they only check that your answer is *reasonable*, or in the correct format. To put it another way: all of your tests might pass, but that doesn't mean you'll get full credit -- some of your answers may still be wrong. It's up to you to make sure that they're right!
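
To make the difference concrete, here is a small sketch of a format-only check next to a correctness check. The function names `public_test` and `hidden_test` and the value of `correct` are hypothetical and are not part of `otter`; they only mirror the shape of the checks described above.

```python
def public_test(answer):
    """Format-only check: is the answer one of the option numbers?"""
    return answer in [1, 2, 3, 4, 5]

def hidden_test(answer, correct=4):
    """Correctness check, of the kind that runs only at grading time."""
    return answer == correct

wrong_answer = 3
passes_public = public_test(wrong_answer)  # the format is fine, so this passes
passes_hidden = hidden_test(wrong_answer)  # the value is wrong, so this fails
```

A wrong answer in the right format passes the first check and fails the second, which is why passing tests in the notebook does not guarantee full credit.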