- Mac: `brew install bazel`, then restart the shell.
- Ubuntu: follow the instructions at https://bazel.build/install/ubuntu and ignore the apt-transport-https failure.
- Windows: download the binary and put it somewhere on the PATH, or somewhere it can be referenced directly (eg. C:\bazel.exe).

Verify the installation with `bazel --version`.
These steps are needed in order to run any code/tests that involve TensorFlow in Python (and have it execute on the GPU).
One-time steps for your Mac machine:

- Install the Xcode command line tools, which include tools like make: `xcode-select --install`
- Install the Apple Silicon version of Conda/Miniforge from here: https://github.com/conda-forge/miniforge
- If you don't want conda to auto-activate the base environment in every shell session, run this once to disable that: `conda config --set auto_activate_base false`
Each time you need to create/re-create a conda environment:

- Create a conda environment with Python 3.10. Ideally, we want to use 3.11, but tensorflow-deps currently doesn't support it; as soon as it does, we will move to it. For instance, to create an environment called 'ai' using Python 3.10: `conda create -n ai python=3.10`
- Activate the environment for the next steps: `conda activate ai`
- Install the dependencies, built for M1 in Apple's repo, that make TensorFlow GPU possible on Mac M1: `conda install -c apple tensorflow-deps`
  Note that this already includes some packages (built for M1) that we would normally install separately, including:
  - numpy
  - grpcio
  - protobuf
- Possibly deprecated (left in case the problem recurs): take note of the grpcio and protobuf versions, because due to an issue with how the packages are set up, we will have to restore them later: `conda list`
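If you'd rather capture those versions programmatically than eyeball `conda list` output, a stdlib-only Python sketch (my addition, not part of Apple's instructions) can record them:

```python
# Record installed versions of grpcio and protobuf so they can be
# restored later if pip upgrades them to non-M1 builds.
from importlib.metadata import version, PackageNotFoundError

def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

for pkg in ("grpcio", "protobuf"):
    print(pkg, installed_version(pkg))
```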
- Install TensorFlow and Metal (the Mac equivalent of CUDA) from the Apple team. Why you have to do this via pip instead of conda (which won't work), I cannot begin to guess. `pip install tensorflow-macos tensorflow-metal`
  As a side effect of this, grpcio and protobuf get upgraded to later versions, but via pip, which means they're no longer built for M1 and will fail if used. You have to do some extra work to restore them to a working state. You can see the issue by running `python3 -m grpc` - if grpcio is working correctly, you should only see an error about importing modules, rather than a C++ compilation error, which is what you get when it's not built for M1.
- Possibly deprecated (left in case the problem recurs): restore grpcio and protobuf to a working state. In this example, I assumed two specific versions of these libraries based on the last time I did this; you should adjust based on the versions you noted from `conda list` above.
  `pip uninstall grpcio protobuf`
  `conda install protobuf==3.19.6 grpcio==1.46.3`
  `python3 -m grpc`
  The result of the last command should be an error about importing modules, not a C++ error.
  Note that this downgrades grpcio and protobuf from what tensorflow-macos installs to match what tensorflow-deps installs. This inconsistency is created by the Apple team's process and cannot be resolved by us. It's possible there are side effects of the version mismatch within TensorFlow, but it's hard to say.
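The `python3 -m grpc` check above can also be expressed as a small diagnostic script. This is a sketch of my own, not part of the official setup; it only distinguishes a missing package from a native-extension failure:

```python
# Rough health check for grpcio: an ImportError means the package is
# simply absent, while any other exception (often from the C extension
# failing to load) suggests a build not made for this architecture.
def check_grpcio():
    try:
        import grpc  # noqa: F401
        return "ok"
    except ImportError:
        return "missing"
    except Exception:
        return "broken"

print(check_grpcio())
```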
- Proceed with the "All Environments" steps below.
NOTE: Windows GPU support is gone as of TF 2.11, but you still need to do the driver part of the setup to support Linux/WSL.
- Update your video driver.
- Install the Windows version of Conda/Miniforge from here: https://github.com/conda-forge/miniforge
- Follow these instructions, which will have you create a Conda environment (use the proper Python version instead of the 3.8 in the example) and install TensorFlow. While following these instructions, note the following:
  - As of 5/8/23, these were the versions:
    - TF 2.12.0
    - Python 3.11
    - CUDA 11.8
    - cuDNN 8.6
  - While installing the CUDA toolkit, you only need to check the items under the "CUDA" category. Do not check anything that has a "current version" listed - especially "display driver", which will overwrite your driver with an old one if you check it.
  - If you get an error during installation about the Visual Studio version, you need to install an older version (eg. 2019). It can coexist with the newer versions.
  - You may have to install TensorFlow 2.10 instead of a newer version on Windows, because TensorFlow 2.11 abandoned support for Windows GPU execution. You can still use the dependencies for a later TensorFlow if you are planning to run in WSL/Linux - in that case, the TensorFlow on your Windows system is mostly just for making sure the CUDA stuff can load. If you want to see whether the installed version of TensorFlow has CUDA support (regardless of whether it's actually set up on the system): `tf.test.is_built_with_cuda()`
  - Instead of jupyter lab as in the instructions, install `notebook` and use `python -m notebook` to run it. This is different from both Mac and Linux.
- Resume with the "All Environments" steps below as needed.
  - You can skip the jupyter installation step.
  - To run a jupyter notebook, you will need to copy it into your home directory, because that's how jupyter works on Windows.
  - If you are just setting up on Windows for the benefit of WSL/Linux, you can probably stop after running the benchmark and not bother setting up the rest of the libraries, as you're going to set them up again in WSL/Linux.
- Follow the "Windows" setup first so that the host system has the working CUDA tools that will be needed. Note that the TensorFlow from the host system will not be used in Linux, so it's OK if the version doesn't match. Make sure the Windows environment has the right CUDA versions for the version of TF you want to install in Linux.
- miniforge doesn't seem to work for me in this context, so install miniconda instead:
  `wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh`
  `bash Miniconda3-latest-Linux-x86_64.sh`
  NOTE: you still get conda, so things like sharing yaml environments between Mac and PC should still work.
  - When prompted, pick a non-symlinked install location instead of ~ (eg. /home/davidpet). Otherwise, environment creation will fail.
  - Other things, like .condarc being in ~, still seem to work.
- If you don't want conda to auto-activate on each shell session: `conda config --set auto_activate_base false`
- Follow these steps to set up a Conda environment with CUDA and TensorFlow. You should use the same CUDA and toolkit versions as the Windows host; then they'll be able to communicate so that the Linux TF can use the GPU. Note the following caveats:
  - You need to reactivate the conda environment after the "export" step, or the variables won't apply when you test TensorFlow.
  - You get annoying NUMA errors in the REPL, but they shouldn't affect anything - eg. in a notebook you won't see the errors from imports.
  - Although we're stuck with Python 3.10 on Mac for now, I've been using 3.11 on Linux because I want the performance improvements. Hopefully this inconsistency won't last long.
- Resume with the "All Environments" steps below for each conda environment you create, but note that you have to do some extra work to get Jupyter notebooks working: `pip install --upgrade cchardet Cython chardet`
  - Add `export JUPYTER_ALLOW_INSECURE_WRITES=true` to your .bashrc.
  - When calling jupyter from the shell, add the `--no-browser` flag (eg. make an alias in your .bashrc). This is needed because Jupyter run from WSL Linux will try to use the WSL Linux browser instead of the Windows one. Instead, copy and paste the URL (it changes, so don't bookmark it) into your browser to load the Jupyter UI.
I haven't tried this yet because I use a Macbook and a PC w/ WSL, but it should work similarly to the Mac setup above but with `tensorflow-gpu` instead of `tensorflow-macos`. None of the other Apple conda stuff should be needed, since it has nothing to do with Apple. See here.
These are steps that should be performed next (for each conda environment you create) regardless of whether you're on a Mac, Linux, etc:
- `conda install jupyter`
- At this point, TensorFlow should be runnable on the GPU regardless of OS. To check that, run this notebook using a command such as `jupyter notebook --notebook-dir=~/repos/tutorials/Jupyter`. Follow the instructions in the text cells to tailor it to your environment slightly (eg. import a DLL on Windows, change to legacy Adam on Mac, etc.). If it works, the output of the last code cell should report 1 GPU available, and all 12 training epochs should complete. You can also use the wall time as a benchmark to compare your machines or to test environment optimizations.
- `conda install numpy`
  You might not want to do this on Mac, to avoid issues with tensorflow-deps (which already installed a version of numpy). NOTE: TensorFlow might already do this - I'll check next time I rebuild my environment.
- `conda install pandas`
- `pip install tensorflow_datasets`
- The following libraries might be conda-installable to get a better version than pip (need to check next time I rebuild):
  `pip install matplotlib`
  `pip install scikit-learn`
  `pip install yapf`
  `pip install pylint`
  `pip install pytype` [on hold until it supports 3.11]
  `pip install termcolor`
  (unittest is built into Python, so no install is needed for it.)
- The following should be added to your .bashrc, .bash_profile, etc. to make Python and PyLint work correctly in development:
  - PYTHONPATH set to the location of this repo, so that you can import modules relative to it.
  - PYLINTRC set to the .pylintrc file in this repo (TODO: revisit this procedure later).
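As a quick complement to the benchmark notebook above, a small Python sketch (my addition; it degrades gracefully when TensorFlow isn't installed) can confirm whether TensorFlow sees a GPU on any OS:

```python
# Quick TensorFlow GPU sanity check. Returns None when TensorFlow is
# not installed, otherwise the list of visible GPU devices.
def visible_gpus():
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # On Windows/Linux this reflects CUDA/cuDNN; on Apple Silicon, Metal.
    print("built with CUDA:", tf.test.is_built_with_cuda())
    return tf.config.list_physical_devices("GPU")

print("GPUs:", visible_gpus())
```

An empty list means TensorFlow loaded but cannot see the GPU, which usually points at the driver/CUDA (or tensorflow-metal) part of the setup.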
In order to run openai_api.py and any targets that use it, the following needs to be set up.
NOTE: technically, the grpcio-tools part is not needed to run that API itself, but it is needed by the apps that use it in order to really make use of it (to allow for client/server communication). If you're just running Python code to call OpenAI, that step can be skipped.
- Create a conda environment with Python 3.11.
- If you're going to generate Jupyter notebooks (eg. via SnippetMaster): `conda install jupyter` (if you didn't already do it during TensorFlow setup).
- `pip install termcolor`
- `pip install python_dotenv`
- `pip install openai`
- If on Mac, and you already installed TensorFlow for GPU, make sure to fix the damaged grpcio and protobuf libraries as described in that setup above.
- `conda install grpcio-tools`
  This will also install grpcio and protobuf if you don't have them yet. NOTE: if you installed the TensorFlow GPU stuff and are on a Mac, this won't work, so you will have to do it in a separate Conda environment.
  NOTE: I currently have it set up so that you have to have a conda environment called `bazel-protoc` with grpcio-tools, which bazel builds will call into for compiling Python protobufs/grpc. That is because you can't have tensorflow and grpcio-tools in the same environment on M1, due to dependency version issues. This environment does not need to be manually activated, but it is expected by some bazel steps.
  NOTE: I am currently building the grpc stuff with the newest version but running it with the older one - that may cause issues.
  - There is no other way, because grpcio-tools on conda doesn't support the old grpcio.
  - If any issues are noted later, I might have to look into docker-izing specific parts of the repo or something.
  - Or let some things be broken on Mac, which would get rid of these issues - but since I use my Mac a lot, that would suck.
- Before running any OpenAI API stuff, you need to have a file `~/openai.env` containing `OPENAI_API_KEY=` followed by your API key. Do not commit this file to any repo, or people on the internet will steal your money. Alternatively, you could set OPENAI_API_KEY as an environment variable to override whatever is in openai.env.
TODO: add details here
- Install nvm via curl and shell script.
  - It will download some stuff to ~/.nvm and add some stuff to ~/.bashrc (similar to what Conda does).
  - `command -v nvm` should now output `nvm`.
- `nvm install node`
  This installs the latest version of Node and auto-activates it in every shell instance (silently). In the simple case, you won't need any other versions and won't ever need to activate/deactivate anything or care about Conda for TS/JS/Angular stuff. The one exception is that you need to activate a conda environment to run jupyter notebooks, which creates a bit of a weird cross-dependency.
  After this step, you should be able to run `node` and `npm` from any new terminal instance without doing anything else. Global npm packages (installed with `npm -g packageName`) will install to subfolders of ~/.nvm. Without -g, they will install to a `node_modules` folder in the current folder or its ancestry.
- Add JavaScript support to Jupyter notebooks (assuming jupyter is set up as in the TensorFlow GPU steps above). This will apply to the current Conda environment.
  `npm install -g ijavascript`
  `ijsinstall`
- `npm install -g typescript`
- `npm install -g ts-node`
  `ts-node` is like `node` in that it gives you a TypeScript REPL.
- Add TypeScript support to Jupyter notebooks (assuming jupyter is set up as in the TensorFlow GPU steps above). This will apply to the current Conda environment.
  `npm install -g itypescript`
  `its --install=local`
- `npm install -g @angular/cli`
- You may need to move the ng completion script below the nvm stuff in your .bashrc or .bash_profile.
- Run `npm install` in this repo after syncing, to make sure all packages listed in package.json get installed to node_modules locally. This is both a one-time and an ongoing step.
- prettier
- eslint
- @typescript-eslint/parser
- @typescript-eslint/eslint-plugin
- @jquery
- pnpm
- @bazel/ibazel
You probably want to set up settings sync and make a workspace for this repo (at least).
Recommended Extensions:
- bazel
Other Settings:
- file association: *.bazelrc -> shellscript (for syntax highlighting)
Recommended Extensions:
- Python (set the interpreter to the one for your environment)
- PyLint
- autoDocstring (google style)
- TensorFlow 2.0 Snippets
- Pandas Basic Snippets
- WSL (if on Windows using WSL)
Other Settings:
- Set yapf as formatting provider and add the args '--style' and 'google' for yapf
TBD
Recommended Extensions:
- Angular Essentials (John Papa)
Other Settings:
- Add this to settings.json so that prettier is used as the default formatter, except for Python:
  "editor.defaultFormatter": "esbenp.prettier-vscode",
  "[python]": {
    "editor.defaultFormatter": "ms-python.python"
  }
- `.eslint.js`
  TODO: add here from my other repo and make sure it works.
bazel test //...
`test.py` was a script I used for testing Python before I converted everything to use Bazel.
I'm working on a script to run tests of only changed files relative to the latest git commit, but it's not working yet.
TODO: make hook(s)
bazel run //:buildifier
There are aliases for these here. There is also a script to format and lint all changed files in all supported languages (in progress).
yapf --style google --recursive --in-place [repoPath]
pylint [repoPath]
- NOTE: this will catch more things than pylint in VSCode will catch
- TODO: type checking step when pytype is ready for 3.11
- Manually run formatting in any Jupyter notebooks that changed.
TODO: add details here
TODO: add here (using prettier and eslint)
- `bazel test //...`
  Eventually, I will have a script to run only the tests that are necessary for a change.
- Run `machine_learning/spacebot/run_local.sh` to run the SpaceBot client and server (both Python) - see the comments here for running the client and server separately.
- For Kaggle Titanic, run this notebook.
- The web app for SpaceBot is still under development, but there will likely be a shell script to spin up the Python server, plus an Envoy proxy, plus the Angular app (via ng serve).
- For SnippetMaster, run `bazel run //machine_learning/snippet_master`. For now, output will be printed to the console and generated in `bazel-bin/machine_learning/snippet_master/snippet_master.runfiles/__main__`. You can manually copy the outline and/or .ipynb files to a snippets repo, for instance. It is generally OK to use ctrl-c to stop in the middle of generation; notebooks are not written until the end of each notebook, so as long as you don't kill it right at that moment, it's not likely to cause a problem.
- For Safron, run `bazel run //machine_learning/safron`. It will ask you for a debate topic, a number of rounds, a filename, etc. If you don't give a file, it will not write a file; either way, the results will show in the console. The filename can contain spaces and should not be quoted. It can contain `~` for your home directory. If you don't give the file an extension, it will automatically get a `.txt` extension.
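The filename behavior described above (`~` expansion plus a default `.txt` extension) can be sketched as follows; this is an illustrative helper I wrote for this README, not Safron's actual code:

```python
import os

def normalize_output_path(raw):
    """Expand ~ and add a .txt extension when none is given (illustrative)."""
    path = os.path.expanduser(raw.strip())
    root, ext = os.path.splitext(path)
    if not ext:
        path += ".txt"
    return path

print(normalize_output_path("~/my debate"))  # prints a path ending in "my debate.txt"
```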
The root of this repo is also the root of the Angular workspace, due to the presence of angular.json. The repo is set up to build Angular via bazel in a hybrid way (you can use either `ng` or `bazel` commands and still use schematics).
Angular CLI sees the SpaceBot app as `spacebot-app` (eg. `ng serve spacebot-app`).
The code for the app is located in the `app` folder of the `spacebot` subfolder within the repo.
NPM Dependencies (global):
- pnpm (for generating lock file manually - not used by build yet)
This project was generated with Angular CLI version 16.0.2.
Run `ng serve spacebot-app` for a dev server. Navigate to http://localhost:4200/. The application will automatically reload if you change any of the source files.
Run `bazel run //machine_learning/spacebot/app:serve` for the bazel version.
Run `ng generate component component-name` to generate a new component. You can also use `ng generate directive|pipe|service|class|guard|interface|enum|module`.
Run `ng build spacebot-app` to build the project. The build artifacts will be stored in the `dist/` directory.
Run `bazel build //machine_learning/spacebot/app` for the bazel version. The build artifacts will be stored in `bazel-bin/machine_learning/spacebot/app-dist/spacebot-app`.
Run `ng test spacebot-app` to execute the unit tests via Karma.
Run `bazel test //machine_learning/spacebot/app:test` for the bazel version.
Run `ng e2e` to execute the end-to-end tests via a platform of your choice. To use this command, you need to first add a package that implements end-to-end testing capabilities.
To get more help on the Angular CLI, use `ng help` or go check out the Angular CLI Overview and Command Reference page.
The openmusic folder contains Lisp libraries and OpenMusic workspaces using those libraries. It is not currently integrated into Bazel. I'm not even sure storing it in Git is the best thing, since it has a lot of artifacts and system dependencies, but for now this is where I put it.
Requirements to Use:
- Windows (because of how paths are configured) (for now)
- OM 7.1 Installed
- symlinks because OM only likes to use C:\
- C:\OpenMusic should point to openmusic/workspaces
- C:\OpenMusic Code should point to openmusic/libraries
- any paths you configure or load in OM using these workspaces should be relative to those C:\ symlink paths, not the real paths
- install OM libraries referenced by the workspaces:
- OM-JI
- OMRC
- OMCS
- Chaos
- Cloud
- Situation
- Profile
- LZ
- RepMus
- Alea
- Esquisse
- OMTristan
- OMio
- OM-Sox (different process - have to get snapshot from develop branch in Sourceforge)(retrieved on 1/8/23)
- Do "exclude process" if AV (such as Norton) tries to delete PNG of library (such as OMCS) when you load it
- Enable listener input
- If you need to use MIDI, Mac and Windows have separate steps to set up routing for that
- use Ableton/Serum on Windows and GarageBand on Mac
- Find all ToDo items throughout the repos and consolidate and/or monitor with some kind of automation.
- in the short term, the most important ones are in the BUILD file for SpaceBot
- Encapsulate repetitive install steps with scripts where possible.
- Document the need for extraPaths and manual import fixing to make proto imports work properly in VSCode.
- Improve SpaceBot prompt-injection rejection. Ideas:
  - include more chat context
  - semantic encapsulation (semantic containerization?) - have the AI translate the chat to a story and then write more of the story, then turn it back into a chat
  - break the detection into smaller steps with a rubric-like structure and aggregate
  - give the alien a name and personality profile instead of just dry instructions
- Possibly improve SnippetMaster to generate things besides programming languages (eg. library examples).
- Add tests to SnippetMaster.
- Web app for SnippetMaster? (maybe not - it hits the API key pretty hard)
- Add an option to switch between GPT-3.5 and 4 (since 4 is so expensive).
- Investigate whether we're using the right model and endpoint for code generation.
- Fix SnippetMaster BUILD file generation (names are invalid).
- Fix SnippetMaster markdown title spacing and always saying "above".
- Make a table of languages so far and which SnippetMaster options were used (and how they were found).
- More SnippetMaster languages: Lisp & Clojure (new FP outline needed), Dart, C#, Bash (scripting outline needed).
- Next Next Project: tax chatbot for my wife's business
  - At first I thought this wouldn't work because of the GPT knowledge cutoff.
  - Then, after watching deeplearning.ai courses, I realized I can just download (hopefully automatically) the latest tax docs and put them into a vector store, then apply the "chatting with your data" course principles.
- Other possible projects:
  - Domain adapters for using LLMs for non-LLM things
    - eg. music melodies
    - The idea is that humans learn a lot through language, and LLMs have millions of dollars of training to leverage.
    - The limitation is that some things are also learned by direct sensory experience, in addition to or instead of language.
  - Get caught up on the new stuff that's been coming out (Llama, etc.)
    - more deeplearning.ai short courses
  - Horizontal snippet generator (considering SnippetMaster to be vertical)
    - eg. compare what inheriting an interface looks like in a list of languages
  - Alien Blaster remake (Zazzo the alien, trash-talking you with AI based on gameplay)
Figure out how to make OpenMusic paths system independent
-
See if getting Lisp and OM into Bazel is desirable and feasible
- also make Lisp snippets
-
Improve the mocking in openai_api_test.py (has a rough set of fixes for the openai 1.0.0 migration)
-
Move from grpc to http via Flask to simplify the client/server communication in SpaceBot (and remove bazel dependencies including from the workspace)
-
Put in a diagnostic mode for SpaceBot so can see which responses come from the main LLM vs. moderation vs. injection detection