# ChemML Basic Tutorial

Welcome to ChemML! This notebook will introduce you to the basics of computational chemistry and machine learning with ChemML.

## Learning Objectives

- Set up ChemML environment
- Load and explore molecular data
- Perform basic molecular preprocessing
- Create simple predictive models
- Visualize results

## 1. Environment Setup

In [None]:
# ChemML Setup
import sys
import warnings
warnings.filterwarnings('ignore')

# Core ChemML imports
import chemml
from chemml.preprocessing import MoleculePreprocessor
from chemml.models import ChemMLModel
from chemml.visualization import ChemMLVisualizer

# Optional integrations (with graceful fallbacks)
try:
    from chemml.integrations.experiment_tracking import setup_wandb_tracking
    HAS_TRACKING = True
except ImportError:
    HAS_TRACKING = False
    print("⚠️  Experiment tracking not available")

# Display ChemML info
print(f"🧪 ChemML {chemml.__version__} loaded successfully!")
if HAS_TRACKING:
    print("📊 Experiment tracking available")
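
The setup cell handles the optional tracking integration with a `try`/`except ImportError`. As a minimal sketch of a complementary check, the standard library's `importlib.util.find_spec` can report whether a top-level package is installed without importing it. The helper name `has_module` and the package names in the loop are illustrative assumptions, not part of ChemML.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be located.

    Hypothetical helper for illustration; pass top-level names only.
    """
    # find_spec returns None when no installed module matches the name.
    return importlib.util.find_spec(name) is not None


# Report which optional packages are present (names chosen as examples).
for optional in ("wandb", "matplotlib"):
    status = "available" if has_module(optional) else "not installed"
    print(f"{optional}: {status}")
```

A check like this only tells you that a package can be found. The `try`/`except` form used in the cell is still the safer choice when the import itself might fail.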

## 2. Data Loading

In [None]:
# Load sample data
from chemml.datasets import load_molecules

# Load dataset with error handling
try:
    data = load_molecules()
    print(f"✅ Loaded {len(data)} samples")
except Exception as e:
    print(f"❌ Could not load data: {e}")
    # Fallback to demo data
    data = {"molecules": ["CCO", "CC(C)O", "CCCCO"], "properties": [1.2, 1.5, 1.8]}
    print("📊 Using demo data instead")

print(f"Data keys: {list(data.keys())}")
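
Later cells index the data with `data["molecules"]` and `data["properties"]`, so it can help to confirm that layout right after loading. The sketch below assumes the dictionary shape of the demo fallback; `validate_dataset` is a hypothetical helper, and what `load_molecules()` actually returns is not confirmed here.

```python
def validate_dataset(data):
    """Check the assumed {"molecules": [...], "properties": [...]} layout.

    Returns the number of samples, or raises if the layout is inconsistent.
    """
    for key in ("molecules", "properties"):
        if key not in data:
            raise KeyError(f"missing required key: {key!r}")
    n_molecules = len(data["molecules"])
    n_properties = len(data["properties"])
    if n_molecules != n_properties:
        raise ValueError(
            f"{n_molecules} molecules but {n_properties} property values"
        )
    return n_molecules


# The demo fallback from the cell above: ethanol, isopropanol, 1-butanol.
demo = {"molecules": ["CCO", "CC(C)O", "CCCCO"], "properties": [1.2, 1.5, 1.8]}
n_samples = validate_dataset(demo)
print(f"{n_samples} samples")
```

Note that `len(data)` on a dictionary like this counts keys rather than samples, which is one reason to count the molecule list directly.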

## 3. Molecular Preprocessing

In [None]:
# Initialize preprocessor
preprocessor = MoleculePreprocessor()

# Process molecules with error handling
try:
    processed_data = preprocessor.fit_transform(data["molecules"])
    print(f"✅ Processed {len(processed_data)} molecules")
    print(f"Feature shape: {processed_data.shape}")
except Exception as e:
    print(f"❌ Preprocessing failed: {e}")
    print("💡 Try checking your molecule format (SMILES expected)")
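
To give a feel for what turning SMILES strings into a feature matrix means, here is a deliberately crude sketch that counts characters in the three demo molecules. It is not what `MoleculePreprocessor` does, and `toy_features` is a hypothetical helper. Character counting only works for simple strings like these, and would miscount two-letter elements such as `Cl` or lowercase aromatic atoms.

```python
def toy_features(smiles):
    """Return [carbon count, oxygen count, branch count] for a simple SMILES.

    Toy illustration only: assumes the string contains just 'C', 'O',
    and parentheses, as in the demo molecules.
    """
    n_carbon = smiles.count("C")
    n_oxygen = smiles.count("O")
    n_branches = smiles.count("(")  # each '(' opens one branch
    return [n_carbon, n_oxygen, n_branches]


molecules = ["CCO", "CC(C)O", "CCCCO"]
feature_matrix = [toy_features(s) for s in molecules]

# One row per molecule, one column per feature.
for smiles, row in zip(molecules, feature_matrix):
    print(f"{smiles:8s} -> {row}")
```

For real work, a cheminformatics toolkit that parses SMILES properly is the appropriate tool. The point here is only the shape of the result: one fixed-length numeric row per molecule.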

## 4. Model Training

In [None]:
# Initialize model
model = ChemMLModel(model_type="random_forest")

# Train model with error handling
try:
    model.fit(processed_data, data["properties"])
    print("✅ Model trained successfully")

    # Make predictions
    predictions = model.predict(processed_data)
    print(f"Predictions: {predictions[:5]}")  # Show first 5
except Exception as e:
    print(f"❌ Model training failed: {e}")
    print("💡 Check that data shapes match")
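
The cell above predicts on the same molecules it was trained on, so the printed predictions say little about how the model would do on new data. One simple sanity check, sketched below with only the standard library, is to compare a model's mean absolute error against a baseline that always predicts the mean. The function name and the use of the demo property values are assumptions for illustration.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over paired values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Demo property values from the fallback dataset.
y_true = [1.2, 1.5, 1.8]

# Baseline: predict the mean property value for every molecule.
baseline = [mean(y_true)] * len(y_true)
baseline_mae = mean_absolute_error(y_true, baseline)
print(f"Baseline MAE: {baseline_mae:.3f}")
```

A trained model is only interesting if its error on held-out molecules is clearly lower than this baseline. With three demo molecules there is not enough data for a meaningful split, so treat the numbers in this tutorial as a walkthrough rather than a result.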

## 5. Visualization

In [None]:
# Create visualizations
visualizer = ChemMLVisualizer()

# Plot results with error handling
try:
    visualizer.plot_predictions(data["properties"], predictions)
    print("✅ Visualization created")
except Exception as e:
    print(f"❌ Visualization failed: {e}")
    print("💡 Using matplotlib backend")

    # Fallback to simple matplotlib
    import matplotlib.pyplot as plt
    plt.scatter(data["properties"], predictions)
    plt.xlabel("True Values")
    plt.ylabel("Predictions")
    plt.title("Model Predictions")
    plt.show()

## Summary

Congratulations! You've completed the basic ChemML tutorial. You learned how to:

- ✅ Set up the ChemML environment
- ✅ Load molecular datasets
- ✅ Preprocess molecular data
- ✅ Train predictive models
- ✅ Visualize results

### Next Steps

- Try the intermediate tutorials
- Experiment with different molecular datasets
- Explore advanced model types
- Set up experiment tracking