- cache unpkg files with localforage
- use esbuild wasm in browser
- more on shadcn chart
- shadcn chart
- use Gemini free tier to summarize and hit the 15 calls/min rate limit; tried to pay and use it but was not able to
- use OpenAI GPT-4o mini to start summarizing emails
- still learning more about the Gmail API with Python
- Learned ChromaDB without LangChain (sketch below)
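A minimal sketch of using ChromaDB directly, no LangChain involved; the collection name and documents are made up for illustration:

```python
import chromadb

# chromadb ships with a default embedding function, so no
# LangChain (or explicit embedding model) is needed here
client = chromadb.Client()
collection = client.create_collection("notes")  # hypothetical collection name

collection.add(
    documents=["Bulldozer prices drop in winter.", "Dog breeds vary a lot in size."],
    ids=["doc1", "doc2"],
)

# query by text; chroma embeds the query and returns the nearest documents
results = collection.query(query_texts=["heavy machinery pricing"], n_results=1)
print(results["documents"])
```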
- set up the Gmail read-only API to fetch emails (sketch below)
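A rough sketch of fetching messages with the Gmail API's read-only scope, assuming OAuth credentials have already been obtained (the `creds` setup is elided):

```python
from googleapiclient.discovery import build

# the scope requested during the OAuth flow (flow itself elided)
SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"]

def fetch_recent_messages(creds, n=10):
    """List the n most recent message IDs, then fetch each message's snippet."""
    service = build("gmail", "v1", credentials=creds)
    resp = service.users().messages().list(userId="me", maxResults=n).execute()
    for ref in resp.get("messages", []):
        msg = service.users().messages().get(userId="me", id=ref["id"]).execute()
        print(msg.get("snippet", ""))
```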
- set up `celery` for a job queue (sketch below)
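A minimal `celery` setup sketch, assuming a local Redis broker; the broker URL and task are illustrative:

```python
# tasks.py
from celery import Celery

# assumes Redis running locally as the message broker
app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def summarize_email(body: str) -> str:
    # placeholder work; a real task might call an LLM here
    return body[:100]

# enqueue from anywhere: summarize_email.delay("long email text...")
# run a worker with: celery -A tasks worker --loglevel=info
```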
- set up `pinecone` vector database (sketch below)
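A sketch of the basic Pinecone flow, assuming the v3+ `pinecone` Python client and a serverless index; the index name, dimension, and region are illustrative:

```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")  # placeholder key

# create a small serverless index (dimension must match your embedding model)
pc.create_index(
    name="demo-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

index = pc.Index("demo-index")
index.upsert(vectors=[{"id": "a", "values": [0.1] * 1536}])
print(index.query(vector=[0.1] * 1536, top_k=1))
```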
- learn about the limits of `intermediate_steps` and how using `memory` keeps track of history between two different agent executors
- agent: a chain that knows how to use tools
- agent executor: runs the agent until the output is not a function call (a fancy while loop)
- learn how to make tools (sketch below)
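A small sketch of defining a tool with LangChain's `@tool` decorator (assuming `langchain_core` is available; the tool itself is made up):

```python
from langchain_core.tools import tool

@tool
def word_count(text: str) -> int:
    """Count the number of words in a piece of text."""
    return len(text.split())

# the decorator turns the function into a Tool the agent can call;
# the docstring becomes the tool description the LLM sees
print(word_count.name, word_count.description)
print(word_count.invoke({"text": "hello agent world"}))
```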
- seems like the `chain` concept is important, but not sure I am getting it...
- learn how to calculate how similar vectors are (L2 distance, cosine similarity); sketch below
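A quick numpy sketch of both measures on two example vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 3.0, 4.0])

# L2 (Euclidean) distance: smaller means more similar
l2 = np.linalg.norm(a - b)

# cosine similarity: 1.0 means same direction, 0 means orthogonal
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(l2, cosine)
```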
- use OpenAI Embeddings to turn text -> embeddings -> save to Chroma DB (SQLite-based vector DB)
- use a retriever chain to combine OpenAI chat with the embeddings
- intro to LangChain
- learn LangChain has many utilities to help with using LLMs
- one chain / using sequential chains to talk to the OpenAI LLM
- there is a concept called `Memory`, and using it was helpful for keeping chat history with the OpenAI API
- handwritten number recognition using multinomial logistic regression with gradient descent using `Tensorflow.js`
- logistic regression with gradient descent using `Tensorflow.js`
- linear regression with gradient descent using `Tensorflow.js`
- learning the basics of linear regression with gradient descent (numpy sketch below)
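The course material is in Tensorflow.js, but the same idea as a plain numpy sketch (made-up data, fixed learning rate):

```python
import numpy as np

# made-up data: y = 3x + 2 plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 3 * x + 2 + rng.normal(0, 1, 100)

m, b = 0.0, 0.0  # slope and intercept, starting at zero
lr = 0.01        # learning rate

for _ in range(1000):
    y_pred = m * x + b
    # gradients of mean squared error with respect to m and b
    dm = -2 * np.mean(x * (y - y_pred))
    db = -2 * np.mean(y - y_pred)
    m -= lr * dm
    b -= lr * db

print(m, b)  # should land near 3 and 2
```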
- Modal.com (run ML Python functions in the cloud)
- intro to TensorFlow.js
- trained YOLO with SKU-110K in Colab; took 7 hours and the results looked good for straight images
- learned more about Gradio and implemented YOLO object detection with Gradio (sketch below)
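A minimal Gradio wrapper sketch around an object detector; `detect` is a hypothetical stand-in for whatever YOLO inference function is used:

```python
import gradio as gr
from PIL import Image

def detect(image: Image.Image) -> Image.Image:
    # hypothetical placeholder: a real version would run YOLO
    # inference and draw bounding boxes on the image
    return image

demo = gr.Interface(
    fn=detect,
    inputs=gr.Image(type="pil"),
    outputs=gr.Image(type="pil"),
    title="YOLO object detection",
)
demo.launch()
```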
- start predicting Amazon item prices using an LLM. Dataset comes from: https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023
- intro to LangChain
- intro to RAG
- learn multiple LLM leaderboard websites
- learn `gr.ChatInterface(fn=func)`
- learn `tools`, one of the arguments to `openai.chat.completions.create` (sketch below)
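A short sketch of passing `tools` to the chat completions API (OpenAI Python v1 client; the function schema is a made-up example):

```python
from openai import OpenAI

client = OpenAI()

# a made-up tool the model may choose to call
tools = [{
    "type": "function",
    "function": {
        "name": "get_item_price",
        "description": "Look up the price of an item by name",
        "parameters": {
            "type": "object",
            "properties": {"item": {"type": "string"}},
            "required": ["item"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "How much is a toaster?"}],
    tools=tools,
)

# if the model decided to call the tool, the call shows up here
print(resp.choices[0].message.tool_calls)
```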
- downloaded Ollama and ran deepseek-r1 locally for the first time
- ran llama3.3 70b (43GB) locally and was shocked at how slow the response came back (about 1 word/min)
- save and load trained model
- trained a deep learning model with about 68% accuracy on identifying dog breeds, with an 800/200 training and validation split
- Start building a deep learning model using `keras` and `google/mobilenet-v2`
- Turn images into Tensors
- Start learning about deep learning, TensorFlow.
- Instead of using a local Jupyter Notebook, using Google Colab with GPU access
- Calculated score with training and validation set
- Decreased `model.fit` time by setting `max_samples` much lower than the actual data size

```python
# Change max_samples value to decrease training time
model = RandomForestRegressor(n_jobs=-1, random_state=42, max_samples=10000)  # dataset is about 400,000 rows
```
- Fill missing numerical values, then fill missing non-numerical values
- Fit through RandomForestRegressor (about 400,000 rows --> takes about 1 min to fit)
- convert object dtype to integer-backed `category` using pandas

```python
content.astype("category").cat.as_ordered()  # string values become integer-backed categories
```
- it's a good idea to sort by date when working with time series data

```python
df_temp.sort_values(by=["saledate"], inplace=True, ascending=True)
```
- start project with time series data (bulldozer price prediction) (regression)
- evaluation with `sklearn.metrics`
- hyperparameter tuning with RandomizedSearchCV and GridSearchCV
- hyperparameter tuning for KNN
- correlation matrix: `df.corr()`
- heart-disease-project (hdp): try to understand the data by looking at 2 independent variables at the same time (age vs max heart rate)
- Started heart disease project
- Analyzing the data:
  - what does the target distribution look like? `df["target"].value_counts()`
  - are there any missing values? `df.isna().sum()`
  - what does each individual feature vs the target look like?
- Learned `sklearn.pipeline` to work through car_sales with missing data (imputer to fill NA values, encoder to convert to numeric, combined with a pipeline and preprocessor); sketch below
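A sketch of that imputer + encoder setup; the column names follow the car_sales example but are assumptions here:

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# assumed column split for the car_sales data
categorical = ["Make", "Colour"]
numeric = ["Odometer (KM)", "Doors"]

preprocessor = ColumnTransformer([
    # fill missing categories, then one-hot encode them
    ("cat", Pipeline([
        ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
        ("encoder", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical),
    # fill missing numbers with the column mean
    ("num", SimpleImputer(strategy="mean"), numeric),
])

model = Pipeline([
    ("preprocessor", preprocessor),
    ("regressor", RandomForestRegressor()),
])
# model.fit(X_train, y_train) now works even with NaNs in the raw data
```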
- save & load model with pickle and joblib
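Both save/load routes from the bullet above in one short sketch; `model` is assumed to be an already-fitted estimator:

```python
import pickle
import joblib

# pickle: python's built-in serialization
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
with open("model.pkl", "rb") as f:
    loaded = pickle.load(f)

# joblib: often preferred for sklearn models with large numpy arrays
joblib.dump(model, "model.joblib")
loaded = joblib.load("model.joblib")
```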
- Learned sklearn `RandomizedSearchCV` and `GridSearchCV` for hyperparameter tuning (sketch below)
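A minimal `RandomizedSearchCV` sketch over a couple of RandomForest hyperparameters; the data and grid values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, random_state=42)

param_distributions = {
    "n_estimators": [10, 100, 200, 500],
    "max_depth": [None, 5, 10, 20],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=5,   # try 5 random combinations
    cv=5,       # 5-fold cross validation for each
    random_state=42,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```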
- Divide the data into 3 sets: train, validation, test; train the model, then apply hyperparameter tuning against the validation set
- Hyperparameter tuning by hand on a RandomForestClassifier model (`n_estimators`, `max_depth`)
- Learned the parameter vs hyperparameter concept; will adjust hyperparameters to improve model performance in the future
- Cross Validation Scoring for Classification and Regression
- Mean Squared Error (MSE): amplifies big differences between y_test and y_pred
- Learn intro to regression model evaluation metrics
- R^2 (coefficient of determination) <-- only learned this
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Learn confusion matrix
- Learning regression & classification models using `RandomForestRegressor` / `RandomForestClassifier`
- Learn `predict_proba` vs `predict`
- ROC curve & `roc_auc_score` (sketch below)
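A short sketch tying these together on toy data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)

y_pred = clf.predict(X_test)         # hard class labels
y_proba = clf.predict_proba(X_test)  # probability per class

print(confusion_matrix(y_test, y_pred))
# roc_auc_score wants the probability of the positive class
print(roc_auc_score(y_test, y_proba[:, 1]))
```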
- Scikit-learn using `RandomForestRegressor`

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# refit the model (transformed_X and y come from earlier preprocessing)
np.random.seed(42)
X_train, X_test, y_train, y_test = train_test_split(transformed_X, y, test_size=0.2)
model = RandomForestRegressor()
model.fit(X_train, y_train)
model.score(X_test, y_test)
```
- Scikit-learn intro
- more matplotlib (style & custom subplots)
- export plot as png
- matplotlib exercise
- intro to numpy

```python
import numpy as np

np.array([1, 2, 3])  # create an ndarray
a3 = np.array([[[1, 2, 3],
                [4, 5, 6]],
               [[2, 3, 4],
                [2, 3, 4]],
               [[3, 4, 5],
                [3, 4, 5]]])
range_array = np.arange(3, 7, 2)
randint_array = np.random.randint(1, 10, size=(2, 3))
np.random.seed(seed=1)  # setting the seed makes the output the same every time you run the code
random_array_1 = np.random.randint(10, size=(1, 10))
```
- `%timeit` <-- Jupyter notebook magic function
- python `sum()` vs `np.sum()` <-- `np.sum()` is much faster
- `np.std()`
- reshape and transpose
- dot product
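A tiny numpy sketch covering reshape, transpose, and the dot product:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])     # shape (2, 3)

reshaped = a.reshape(3, 2)    # same data, shape (3, 2)
transposed = a.T              # axes swapped, shape (3, 2)

# dot product: inner dimensions must match, (2, 3) @ (3, 2) -> (2, 2)
product = a.dot(a.T)
print(reshaped, transposed, product, sep="\n")
```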

- `car_sales.drop("<Col Name>", axis=1)` <-- axis=1 represents a column
- `pd.Series([1,2,3])` can be added as a new DataFrame column and the remaining rows fill up as NaN; a plain python list `[1,2,3]` of mismatched length raises an error instead
- intro to pandas in Jupyter notebook: used `pd.Series`, `pd.DataFrame`, `pd.read_csv`, `{data_frame}.to_csv`
- describe info with pandas using `dtypes`, `columns`, `.mean()`, `.info()`, `.sum()`, ...etc
- select and view data with pandas:

```python
car_sales.head()  # show first 5 rows
car_sales[car_sales["Make"] == "Toyota"]  # boolean indexing
car_sales.groupby(["Make"]).mean(numeric_only=True)
car_sales["Odometer (KM)"].hist()
car_sales["Price"] = car_sales["Price"].str.replace(r'[\$\,\.]', '', regex=True)
car_sales["Price"] = car_sales["Price"].astype(int)
car_sales["Price"].plot()
```
- manipulate data with pandas
  - `car_sales_drop_missing = car_sales_missing.dropna()`
  - `car_sales_missing["Odometer"] = car_sales_missing["Odometer"].fillna(car_sales_missing["Odometer"].mean())`
- activate conda env
- install jupyter pandas numpy matplotlib scikit-learn (issue with a higher python version while installing jupyter, so had to use `conda create --name jupyter-env python=3.10`)
- taking a machine learning course; installed Miniconda
Learned from
- freeCodeCamp Blog: https://www.freecodecamp.org/news/python-version-on-mac-update/
- freeCodeCamp Python API Development - https://www.youtube.com/watch?v=0sOvCWFmrtA

Steps
- Install brew
- Use pyenv to change the python version

```bash
brew install pyenv
pyenv install 3.11.0
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc
echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n  eval "$(pyenv init --path)"\n  eval "$(pyenv init -)"\nfi' >> ~/.zshrc
pyenv global 3.9.2
pyenv versions
```
- Install the VSCode python extension
- Go to the project and create a virtual environment

```bash
# python3 -m venv <name>
python3 -m venv venv
```
- Enter the interpreter at `./venv/bin/python`
- In the command line, `source venv/bin/activate` will make the command line prefix show `(venv)`
- `pip install fastapi[all]`; `pip freeze` will show what libs have been installed
- `uvicorn main:app --reload`





