# Nanobot Design Assistant with DeepSeek-R1-Distill-Qwen-7B
Welcome to your Nanobot Design Assistant! This Jupyter notebook uses the open-source `DeepSeek-R1-Distill-Qwen-7B` model to help you design a nanobot based on your requirements. It integrates with the Materials Project API for material suggestions and provides visuals and feedback.

## Why DeepSeek-R1-Distill-Qwen-7B?
- **Open-Source**: Freely available on Hugging Face, no API costs or restrictions.
- **Reasoning Power**: Distilled from DeepSeek-R1, it is tuned for step-by-step reasoning, which suits design ideation.
- **Manageable**: Runs on Colab’s T4 GPU or locally with ~16GB RAM.

## How It Works
1. **Input Requirements**: Tell us what you need (e.g., "drug delivery nanobot, <200 nm, biocompatible").
2. **Design Suggestion**: The LLM suggests a design archetype and properties.
3. **Material Search**: We find matching materials from the Materials Project.
4. **Selection with Visuals**: Pick materials with atomic structure images.
5. **Review and Feedback**: Get a summary and tips to refine your design.

Run the cells below in order—let’s get started!

In [4]:
print("Installing required packages... This may take a few minutes.")
!pip install transformers mp_api pymatgen torch ase ipython -q
print("Dependencies installed.")

Installing required packages... This may take a few minutes.

## Step 1: Import Libraries
We’ll import the tools we need:
- `transformers` for DeepSeek-R1-Distill-Qwen-7B.
- `mp_api` and `pymatgen` for material data.
- `ase` for visualizing structures.
- `torch` for model execution.

Run this cell to set up the environment.

In [5]:
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
import mp_api.client as mp
from pymatgen.core import Structure
import torch
from ase import Atoms
from ase.io import write
from IPython.display import Image, display
import warnings
warnings.filterwarnings("ignore")  # Suppress warnings for cleaner output



## Step 2: Load DeepSeek-R1-Distill-Qwen-7B
Here, we load the distilled DeepSeek model and tokenizer from Hugging Face. We’re using the 7B parameter version for efficiency. If you’re on Colab with a T4 GPU, it’ll use CUDA; otherwise, it falls back to CPU, where generation with a 7B model is very slow.

Run this cell to initialize the model. The first run downloads roughly 15 GB of weights, so it can take several minutes.
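To judge whether the model fits on your hardware, a rough estimate of weight memory is parameters times bytes per parameter. The sketch below assumes about 7.6 billion parameters for this checkpoint (an assumption, not a figure from this notebook) and ignores activations and the KV cache, so treat the numbers as lower bounds.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# PARAM_COUNT is an assumed value for this checkpoint, not a measured one.
PARAM_COUNT = 7.6e9

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Return the approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(PARAM_COUNT, dtype):.1f} GB")
```

Under that assumption, `float16` weights alone come close to the memory of a 16 GB device, which leaves little headroom for generation.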

In [6]:
print("\nLoading DeepSeek-R1-Distill-Qwen-7B model...")
model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# Move to GPU if available (e.g., T4 on Colab)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
print(f"Model loaded on {device}.")


Loading DeepSeek-R1-Distill-Qwen-7B model...

Model loaded on cpu.


## Step 3: Set up Materials Project API
We’ll connect to the Materials Project API to fetch material data. You’ll need an API key from [Materials Project](https://materialsproject.org/). Enter it when prompted.

Run this cell to establish the connection.
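If you prefer not to type the key into the notebook each time, one option is to read it from an environment variable and only prompt as a fallback. This is a minimal sketch: `load_api_key` is a hypothetical helper, and the variable name `MP_API_KEY` is an assumption you can change.

```python
import os
from getpass import getpass

def load_api_key(env=os.environ, var="MP_API_KEY", prompt_fn=getpass):
    """Return the key from the environment, or prompt for it without echoing."""
    key = env.get(var, "").strip()
    if not key:
        # getpass hides the typed key, so it does not end up in saved cell output
        key = prompt_fn("Materials Project API key: ").strip()
    if not key:
        raise ValueError("No API key provided.")
    return key

# Usage in the next cell (hypothetical): mpr = mp.MPRester(load_api_key())
```

Keeping the key out of the cell output matters when the notebook is shared or committed.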

In [7]:
print("\nSetting up Materials Project API connection...")
from getpass import getpass  # hides the key instead of echoing it into the cell output
api_key = getpass("Please enter your Materials Project API key: ")
mpr = mp.MPRester(api_key)
print("API connection established.")


Setting up Materials Project API connection...
Please enter your Materials Project API key: [redacted]
API connection established.


## Step 4: Define Helper Functions
These functions make our design process smoother:
- **Generate Response**: Uses DeepSeek to process your input and suggest designs/properties.
- **Visualize Material**: Shows atomic structures of suggested materials.
- **Search Materials**: Queries the API for materials matching properties.

Run this cell to set up these tools.

In [8]:
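def generate_response(prompt, max_new_tokens=500):
    """Generate a response using DeepSeek-R1-Distill-Qwen-7B."""
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # Budget for the reply only, excluding the prompt
        temperature=0.6,  # Recommended by DeepSeek for balanced output
        top_p=0.95,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,  # Avoids the pad_token_id warning
    )
    # Decode only the newly generated tokens so the prompt is not echoed back
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)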

def visualize_material(material_id, component):
    """Visualize the atomic structure of a material."""
    try:
        doc = mpr.summary.search(material_ids=material_id)[0]
        structure = doc.structure
        # Pass the lattice so the unit cell can actually be drawn
        atoms = Atoms(symbols=[site.specie.symbol for site in structure],
                      positions=[site.coords for site in structure],
                      cell=structure.lattice.matrix, pbc=True)
        filename = f"{component}_{material_id}.png"
        write(filename, atoms, show_unit_cell=True, rotation="10x,20y,30z")
        display(Image(filename=filename))
        return True
    except Exception as e:
        print(f"Error visualizing material {material_id}: {e}")
        return False

def search_materials(props):
    """Search Materials Project for materials matching properties."""
    direct_query = {}
    density = props.get("density")
    if isinstance(density, dict) and "max" in density:
        # mp_api expects numeric ranges as (min, max) tuples
        direct_query["density"] = (0, density["max"])
    if "elements" in props:
        direct_query["elements"] = list(props["elements"])  # List of symbols, not a joined string
    try:
        materials = mpr.summary.search(**direct_query, fields=["material_id", "density", "formula_pretty"])
        # str() makes the ID comparable with typed input and JSON-serializable
        return [{"id": str(mat.material_id), "formula": mat.formula_pretty, "density": mat.density}
                for mat in materials[:3]]  # First 3 results for simplicity
    except Exception as e:
        print(f"Error searching materials: {e}")
        return []

## Step 5: Start the Design Process
Let’s dive in! This cell welcomes you and outlines the process:
1. Input your requirements.
2. Get a design suggestion from DeepSeek.
3. See suggested properties.
4. Select materials with visuals.
5. Review the design with feedback.

Run this cell to begin.

In [9]:
print("\nWelcome to the Nanobot Design Assistant!")
print("Using DeepSeek-R1-Distill-Qwen-7B, we’ll design a nanobot in 5 steps.")
print("Run each cell below in sequence, providing input when prompted.")


Welcome to the Nanobot Design Assistant!
Using DeepSeek-R1-Distill-Qwen-7B, we’ll design a nanobot in 5 steps.
Run each cell below in sequence, providing input when prompted.


## Step 6: Input Requirements (Design Step 1)
Tell us what you want your nanobot to do in natural language. Example: "I need a nanobot to deliver drugs to cancer cells, smaller than 200 nm, biocompatible." The more details, the better DeepSeek can tailor the design.

Enter your requirements below.

In [10]:
print("\nStep 1: Input Requirements")
user_input = input("Describe your nanobot requirements: ")
print(f"Your requirements: {user_input}")


Step 1: Input Requirements
Describe your nanobot requirements: a bot that can be magnetically controlled from outside the structure it goes in and it scratches the walls of the structure it has gone in
Your requirements: a bot that can be magnetically controlled from outside the structure it goes in and it scratches the walls of the structure it has gone in


## Step 7: Design Suggestion (Design Step 2)
DeepSeek will suggest a design archetype based on your input (e.g., core-shell, functionalized surface). It’ll provide a description to help you visualize it. Run this cell to see the suggestion.
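DeepSeek-R1 style models usually write out their reasoning before the final answer, typically wrapped in `<think>` tags. The sketch below assumes that tag format and keeps only the answer text; `strip_reasoning` is a hypothetical helper, not part of this notebook.

```python
def strip_reasoning(text: str) -> str:
    """Return the answer text with any <think>...</think> reasoning removed."""
    if "</think>" in text:
        # Keep whatever follows the last closing tag
        return text.rsplit("</think>", 1)[1].strip()
    if "<think>" in text:
        # Generation stopped mid-reasoning: keep only what came before it
        return text.split("<think>", 1)[0].strip()
    return text.strip()

print(strip_reasoning("<think>Magnetic core needed...</think>\nCore-shell design"))
```

If the result is empty, the model most likely ran out of tokens before finishing its reasoning, and a larger generation budget may help.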

In [None]:
print("\nStep 2: Design Suggestion")
prompt = f"Based on the requirements: '{user_input}', suggest a nanobot design archetype (e.g., core-shell, functionalized surface, nanoparticle cluster) and provide a brief description."
design_response = generate_response(prompt)
print("DeepSeek’s design suggestion:", design_response)

Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.



Step 2: Design Suggestion


## Step 8: Property Suggestion (Design Step 3)
Now, DeepSeek will suggest specific properties for each component of the design (e.g., density, elements). These come structured as a dictionary for easy use. Run this cell to see the properties.
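Model replies often wrap the JSON in extra prose or code fences, so it can help to scan for the first span that parses as a JSON object. This is a minimal sketch using only the standard library; `extract_json` is a hypothetical helper name.

```python
import json

def extract_json(text: str):
    """Return the first parseable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at the given index
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    return None

reply = 'Here you go:\n{"core": {"density": {"max": 2.0}, "elements": ["Fe", "O"]}}\nDone.'
print(extract_json(reply))
```

When it returns `None`, falling back to default properties, as the cell below does, keeps the workflow moving.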

In [None]:
import json

print("\nStep 3: Property Suggestion")
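# Valid JSON (double quotes, lowercase true) so the model's reply can be parsed by json.loads
example = json.dumps({"core": {"density": {"max": 2.0}, "elements": ["Si", "C", "O"]},
                      "shell": {"biocompatible": True, "elements": ["Au", "Ag", "Pt"]}})
# Include the earlier suggestion explicitly: the model keeps no memory between calls
prompt = (f"For this nanobot design: '{design_response}' and requirements: '{user_input}', "
          f"suggest specific property requirements for each component in JSON format. Example: {example}")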
properties_response = generate_response(prompt)
try:
    properties = json.loads(properties_response.split("```json")[1].split("```")[0] if "```json" in properties_response else properties_response)
except Exception as e:
    print(f"Error parsing properties: {e}. Using fallback properties.")
    properties = {"core": {"density": {"max": 2.0}, "elements": ["Si", "C", "O"]},
                  "shell": {"biocompatible": True, "elements": ["Au", "Ag", "Pt"]}}
print("Suggested properties:", json.dumps(properties, indent=2))

## Step 9: Material Search and Selection (Design Step 4)
Using the properties, we’ll search the Materials Project for matching materials. You’ll see up to 3 options per component with their formula, density, and atomic structure image. Enter the material ID to select one. Run this cell to explore and choose.
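The translation from a component's property dictionary to search arguments can be checked offline before calling the API. The sketch below assumes the search accepts `density` as a `(min, max)` tuple and `elements` as a list of symbols; `build_query` is a hypothetical helper, and the argument names should be verified against your installed `mp_api` version.

```python
def build_query(props: dict) -> dict:
    """Translate one component's property dict into search keyword arguments."""
    query = {}
    density = props.get("density")
    if isinstance(density, dict) and "max" in density:
        # Assumed convention: numeric ranges are passed as (min, max) tuples
        query["density"] = (0.0, float(density["max"]))
    elements = props.get("elements")
    if elements:
        query["elements"] = [str(el) for el in elements]
    # Keys with no search equivalent (e.g. "biocompatible") are ignored
    return query

print(build_query({"density": {"max": 2.0}, "elements": ["Si", "C", "O"]}))
```

Properties such as `biocompatible` have no counterpart in this query, so those requirements still need manual review.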

In [None]:
print("\nStep 4: Material Search and Selection")
material_suggestions = {}
for component, props in properties.items():
    print(f"\nSearching materials for {component} with properties: {props}")
    material_suggestions[component] = search_materials(props)

selected_materials = {}
for component, options in material_suggestions.items():
    if options:
        print(f"\n{component.capitalize()} material options:")
        for opt in options:
            print(f"ID: {opt['id']}, Formula: {opt['formula']}, Density: {opt['density']} g/cm³")
            visualize_material(opt["id"], component)
        choice = input(f"Enter the material ID for {component} (or 'skip'): ").strip()
        # Compare as strings: the API may return ID objects rather than plain str
        match = next((opt for opt in options if str(opt["id"]) == choice), None)
        if match is not None:
            selected_materials[component] = match
            print(f"Selected {choice} for {component}")
        else:
            print(f"No valid selection for {component}")
    else:
        print(f"No materials found for {component}. Try broader properties.")
print("\nSelected materials:", selected_materials)

## Step 10: Review and Feedback (Design Step 5)
Here’s your final design summary! It includes the archetype, properties, and selected materials. DeepSeek will also provide feedback to refine your design. Run this cell to wrap up and get next steps.

In [None]:
print("\nStep 5: Review and Feedback")
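# First non-empty line of the suggestion; avoids an IndexError on an empty response
archetype = next((line for line in design_response.splitlines() if line.strip()), "n/a")
# default=str keeps json.dumps from failing on non-serializable values such as ID objects
properties_json = json.dumps(properties, indent=2, default=str)
materials_json = json.dumps(selected_materials, indent=2, default=str)
prompt = f"Review this nanobot design:\n- Requirements: {user_input}\n- Archetype: {archetype}\n- Properties: {properties_json}\n- Materials: {materials_json}\nProvide feedback and suggestions."
feedback = generate_response(prompt)
print(f"Design Summary:\n- Archetype: {archetype}\n- Properties: {properties_json}\n- Selected Materials: {materials_json}")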
print("\nDeepSeek’s Feedback:", feedback)
print("\nNext Steps: Refine based on feedback or consult a nanotechnology expert.")