# Stock Comparator Usage Examples
Before we start, check two things:
1) Is the working directory exactly "stock_comparator"? If not, the program will raise an error. All of my paths are built from basepath = "./", so the program has to start in the right directory. With more time I would have made this more robust.

2) Is yfinance installed? It is only needed if you increase the number of stocks above 500, or if you delete files in the ./data folder. I recommend deleting a few files just to check that my program does what I claim it does. :)
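Both checks can be run up front. The sketch below is not part of the stock_comparator package; `check_environment` is a hypothetical helper that only assumes the directory name and the yfinance requirement described above.

```python
import importlib.util
from pathlib import Path


def check_environment(expected_dir="stock_comparator", cwd=None):
    """Return a list of human-readable problems (empty if everything looks fine).

    Hypothetical helper, not part of the stock_comparator package.
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    problems = []
    # The program builds its paths from "./", so the folder name matters.
    if cwd.name != expected_dir:
        problems.append(
            f"working directory is '{cwd.name}', expected '{expected_dir}'"
        )
    # find_spec returns None when a top-level package cannot be imported.
    if importlib.util.find_spec("yfinance") is None:
        problems.append("yfinance is not installed (only needed for downloads)")
    return problems


for problem in check_environment():
    print("Warning:", problem)
```

If the first warning appears, change into the stock_comparator directory (for example with `os.chdir`) before importing anything.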

## 1) Standard Use Case

### 1.1) Import

In [None]:
from stock_comparator import StockComparator

### 1.2) Make a StockComparator Instance
Note: the results folder for an instance is created at this point, when the instance is constructed.
Its path is ./results/YYYY_mm_dd_HHMMSS.

In [None]:
master_file = "nasdaq_screener_1636426496281.csv"

comp = StockComparator(master_file, num_stocks=500,
                       smooth_window="2W", chg_period=10)
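To see what a folder name in the YYYY_mm_dd_HHMMSS pattern looks like, here is a small sketch. `results_dir_for` is a hypothetical helper, and the use of local time is an assumption. The class may build the name differently internally.

```python
from datetime import datetime
from pathlib import Path


def results_dir_for(moment, base="./results"):
    """Build a results path in the YYYY_mm_dd_HHMMSS pattern.

    Hypothetical helper for illustration; it does not create the folder.
    """
    return Path(base) / moment.strftime("%Y_%m_%d_%H%M%S")


# Assumption: the timestamp is taken in local time when the instance is made.
print(results_dir_for(datetime.now()))
```

Because the name goes down to the second, each instance you create gets its own folder as long as they are created at least one second apart.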

### 1.3) How many processes to use?
I recommend leaving some headroom: if you have 12 CPU threads, use about 8 processes.
This leaves some resources for the rest of the system.

In [None]:
from os import cpu_count
cpu_count()
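If you would rather compute the number than pick it by hand, the rule of thumb above can be sketched like this. `suggested_processes` and the one-third reserve are my own illustration (chosen so that 12 threads gives 8 processes), not something the package provides.

```python
from os import cpu_count


def suggested_processes(total=None, reserve_fraction=1 / 3):
    """Suggest a process count that leaves some threads for the system.

    Hypothetical helper; the one-third reserve is an assumption.
    """
    # cpu_count() can return None when the count cannot be determined.
    total = total or cpu_count() or 1
    reserved = max(1, round(total * reserve_fraction))
    return max(1, total - reserved)


print(suggested_processes())
```

The result can then be passed as `p` in the next step.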

### 1.4) Start Multiprocessing Batch Correlation
If you run this as an external script, each process writes its own output to the terminal, which produces a lot of output.
In an IPython console (Jupyter or Spyder), the child processes don't write to the console, so you get no output while the batch is running. There is also some risk that the child processes won't inherit the module (at least in Spyder). This is why I recommend running this program in a terminal.

In [None]:
comp.batch_correlate_multiprocess(p=8)

### 1.5) Results and Saving
Results are saved in `self.corr_results` (here, `comp.corr_results`). Let's show the 10 highest-correlating results.

In [None]:
comp.corr_results.head(10)

### 1.6) Recalculating Results and Plotting
The first cross-correlation pass does not return the correlation signal and lag arrays for each comparison (that would be an insane amount of data). This step runs the correlation again on the top `n` correlating stock combinations, this time returning the full correlation array for each comparison. We can also set a minimum and maximum lag threshold (`lag_min` and `lag_max`, respectively) so that non-lagging combinations are ignored: only combinations whose lag has an absolute value between `lag_min` and `lag_max` are kept. Finally, the top `plot_top` combinations are plotted into the results directory.

In [None]:
comp.plot_top_correlations(n=200, lag_min=3, lag_max=60, plot_top=10)
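To make the idea of a lag and of the lag window concrete, here is a minimal NumPy sketch. It is not the code used inside `plot_top_correlations`; `best_lag` and `within_lag_window` are hypothetical helpers, and the z-score normalization is an assumption.

```python
import numpy as np


def best_lag(a, b):
    """Return (lag, value) at the peak of the full cross-correlation of a and b.

    Hypothetical helper. A positive lag means `a` follows `b` by that many samples.
    """
    a = (a - a.mean()) / a.std()  # z-score so the peak is roughly in [-1, 1]
    b = (b - b.mean()) / b.std()
    corr = np.correlate(a, b, mode="full") / len(a)
    # In "full" mode the lags run from -(len(b) - 1) to len(a) - 1.
    lags = np.arange(-(len(b) - 1), len(a))
    i = np.argmax(np.abs(corr))
    return int(lags[i]), float(corr[i])


def within_lag_window(lag, lag_min, lag_max):
    """Keep a pair only if |lag| lies between lag_min and lag_max (inclusive)."""
    return lag_min <= abs(lag) <= lag_max


# Synthetic example: `follower` is `leader` shifted by 5 samples.
rng = np.random.default_rng(0)
leader = rng.standard_normal(300)
follower = np.roll(leader, 5)

lag, peak = best_lag(follower, leader)
print(lag, within_lag_window(lag, lag_min=3, lag_max=60))
```

A pair whose peak sits at lag 0 moves in step, so it falls outside the window and is dropped. Whether the thresholds in the real method are inclusive is an assumption here.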

### 1.7) Top Results
The top results are stored in `self.results_top`, and also exported to the results directory.

In [None]:
comp.results_top