<a href="https://colab.research.google.com/github/deltorobarba/chemistry/blob/main/basis_set.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Basis sets are fundamental to quantum chemistry calculations, including Hartree-Fock, DFT, and post-Hartree-Fock methods (e.g., MP2, CCSD). A **basis set** is a set of functions whose linear combinations are used to build the orbitals of atoms and molecules. In most molecular codes these functions are contracted Gaussian-type orbitals (GTOs): fixed linear combinations of primitive Gaussians chosen to approximate the shape of atomic orbitals.
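
To make the idea of a contracted Gaussian concrete, the following minimal sketch checks that the STO-3G 1s function of hydrogen is normalized. The exponents and contraction coefficients are the commonly tabulated STO-3G values for hydrogen and are an assumption here (verify them against a basis set library before reuse); the function names are illustrative.

```python
import math

# STO-3G hydrogen 1s: three primitive Gaussians exp(-alpha * r^2).
# Commonly tabulated values, treated as an assumption in this sketch.
EXPONENTS = [3.42525091, 0.62391373, 0.16885540]
COEFFICIENTS = [0.15432897, 0.53532814, 0.44463454]


def primitive_overlap(a: float, b: float) -> float:
    """Overlap of two normalized s-type Gaussians on the same center."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** 1.5


def contracted_self_overlap(exponents, coefficients) -> float:
    """<phi|phi> for phi = sum_i c_i * g_i with normalized primitives g_i."""
    return sum(
        ci * cj * primitive_overlap(ai, aj)
        for ai, ci in zip(exponents, coefficients)
        for aj, cj in zip(exponents, coefficients)
    )


norm = contracted_self_overlap(EXPONENTS, COEFFICIENTS)
print(f"self-overlap of the contracted 1s function: {norm:.6f}")
```

A value very close to 1 indicates that the three primitives, weighted by their coefficients, behave as a single normalized basis function.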

### **Types of Basis Sets:**

1. **Minimal Basis Sets:**
   - **STO-3G** (Slater-Type Orbitals approximated by 3 Gaussians):
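     - **Description**: This is a minimal basis set, meaning it uses one basis function for each atomic orbital occupied in the free atom (for carbon: 1s, 2s, and the three 2p orbitals, five functions in total). Each function is a fixed combination of 3 Gaussians fitted to a Slater-type orbital.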
     - **Use**: It's often used for quick, low-accuracy calculations, typically for larger systems where computational efficiency is critical.
     - **Limitations**: The accuracy is quite limited, especially for systems where electron correlation or polarization effects are important.

2. **Split-Valence Basis Sets:**
   - **3-21G** or **6-31G**:
     - **Description**: These basis sets use a single function for each core orbital but split each valence orbital into an inner and an outer function. The digits count primitive Gaussians. In 3-21G, each core orbital is one contraction of 3 Gaussians, and each valence orbital is split into a contraction of 2 Gaussians plus a single Gaussian. In 6-31G, each core orbital is one contraction of 6 Gaussians, and each valence orbital is split into a contraction of 3 Gaussians plus a single Gaussian.
     - **Use**: These are a step up in accuracy from minimal basis sets and are commonly used for moderate-sized systems in general-purpose calculations.
     - **Limitations**: While more accurate than STO-3G, these still do not account for polarization or diffuse effects, which may be important for more complex systems or anions.

3. **Polarized Basis Sets:**
   - **6-31G(d)** or **6-31G(d,p)**:
     - **Description**: These add polarization functions, that is, functions of higher angular momentum than the occupied valence orbitals, which allow the electron density to distort in response to the molecular environment. 6-31G(d) adds d-type functions to heavy (non-hydrogen) atoms; 6-31G(d,p) additionally adds p-type functions to hydrogen.
     - **Use**: Polarized basis sets are essential for studying molecular geometries, vibrational frequencies, and reactions because they provide a better description of the electron cloud's flexibility.
     - **Limitations**: More computationally expensive than standard split-valence sets, but much more accurate for systems where electron polarization is significant.

4. **Diffuse Basis Sets:**
   - **6-31+G** or **6-31++G**:
     - **Description**: These basis sets add diffuse functions (denoted by "+"), which are functions with very small exponents that describe electron density far from the nucleus, as found in anions or excited states. A single "+" adds diffuse functions to heavy (non-hydrogen) atoms only; "++" adds them to hydrogen as well.
     - **Use**: These are critical for systems with loosely bound electrons, such as anions, Rydberg states, and excited states.
     - **Limitations**: Adding diffuse functions increases the computational cost, but it’s necessary for accurately treating weakly bound electrons.

5. **Correlation-Consistent Basis Sets:**
   - **cc-pVDZ (Correlation-Consistent Polarized Valence Double Zeta)**, **cc-pVTZ**, **cc-pVQZ**:
     - **Description**: These basis sets are designed for correlated (post-Hartree-Fock) methods such as MP2 and CCSD. Functions are added in groups that contribute similar amounts of correlation energy, so results improve systematically along the series. They are designated as **double-zeta** (DZ), **triple-zeta** (TZ), or **quadruple-zeta** (QZ), where a higher "zeta" level means more functions per valence orbital, more flexibility, and a better description of electron correlation.
     - **Use**: Commonly used in accurate wavefunction-based calculations, such as MP2 and coupled-cluster methods like CCSD(T), especially for high-precision studies of electron correlation and molecular properties.
     - **Limitations**: They are computationally expensive but highly accurate. The cost increases with higher levels (e.g., cc-pVTZ vs. cc-pVDZ), but so does accuracy.

6. **Augmented Correlation-Consistent Basis Sets:**
   - **aug-cc-pVDZ**, **aug-cc-pVTZ**:
     - **Description**: These are correlation-consistent basis sets extended with **diffuse functions**, one extra function for each angular momentum already present in the parent set. They are indicated by the "aug-" prefix.
     - **Use**: Necessary for high-accuracy calculations involving systems with weakly bound electrons, such as van der Waals interactions, anions, and excited states.
     - **Limitations**: The added functions raise the computational cost significantly, and the overall accuracy is still limited by the zeta level: aug-cc-pVDZ remains a double-zeta set.
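
The Pople-style names in the list above encode their own structure: digits for primitive counts, "+" for diffuse functions, and a parenthesized suffix for polarization. The sketch below decodes that notation for the names mentioned here. It is an illustrative helper (the function name `describe_pople` and the dictionary keys are made up for this example), not part of any quantum chemistry package, and it deliberately rejects names outside this pattern.

```python
import re

# Matches names such as 3-21G, 6-31G, 6-31+G, 6-31++G, 6-31G(d), 6-31G(d,p).
POPLE_NAME = re.compile(r"(\d)-(\d+)(\+{0,2})G(?:\((d)(?:,(p))?\))?")


def describe_pople(name: str) -> dict:
    """Decode a Pople-style basis set name into its building blocks."""
    match = POPLE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unsupported basis set name: {name!r}")
    core, valence, plus, d_flag, p_flag = match.groups()
    return {
        # Primitives contracted into each core function.
        "core_primitives": int(core),
        # Primitives in each valence function, inner part first.
        "valence_primitives": [int(digit) for digit in valence],
        "diffuse_on_heavy_atoms": len(plus) >= 1,
        "diffuse_on_hydrogen": len(plus) == 2,
        "d_on_heavy_atoms": d_flag is not None,
        "p_on_hydrogen": p_flag is not None,
    }


for basis in ("3-21G", "6-31G", "6-31G(d,p)", "6-31++G"):
    print(basis, describe_pople(basis))
```

STO-3G and the correlation-consistent sets follow different naming schemes, so this helper raises a `ValueError` for them rather than guessing.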

### **Summary of Basis Set Applications:**

| **Basis Set**    | **Type**                    | **Usage**                                  | **Typical Application**                                          |
|------------------|-----------------------------|--------------------------------------------|------------------------------------------------------------------|
| **STO-3G**       | Minimal                     | Low accuracy, fast                         | Large molecules, preliminary calculations                        |
| **3-21G / 6-31G**| Split-Valence                | General-purpose                            | Medium-accuracy molecular calculations                           |
| **6-31G(d)**     | Polarized Split-Valence      | Improved geometry and electronic structure | Molecular geometries, vibrational frequency studies               |
| **6-31+G**       | Diffuse Basis Set            | Anions, excited states                     | Anions, systems with loosely bound electrons                     |
| **cc-pVDZ**      | Correlation-Consistent (DZ)  | Entry-level correlated calculations        | Post-Hartree-Fock methods (MP2, CCSD), exploratory property studies |
| **cc-pVTZ**      | Correlation-Consistent (TZ)  | High accuracy, larger basis set            | Highly accurate property studies and correlation calculations    |
| **aug-cc-pVDZ**  | Augmented Correlation Cons.  | Anions, weak interactions, excited states  | Van der Waals forces, excited state calculations                 |

### **Basis Set Choice Considerations:**
- **Minimal Basis Sets** (e.g., STO-3G) are typically only used when speed is a priority and accuracy is less important.
- **Split-Valence and Polarized Basis Sets** (e.g., 6-31G, 6-31G(d)) are often a good compromise between accuracy and cost.
- **Diffuse and Augmented Basis Sets** (e.g., 6-31+G, aug-cc-pVDZ) are essential for systems with weakly bound electrons or when studying anions or excited states.
- **Correlation-Consistent Basis Sets** (e.g., cc-pVDZ, cc-pVTZ) are critical for high-accuracy, post-Hartree-Fock methods.

These basis sets are chosen based on the trade-off between accuracy and computational cost, with more complex systems (or those requiring higher accuracy) demanding larger and more sophisticated basis sets.
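
A rough sense of this trade-off can be had from a back-of-the-envelope estimate. The sketch below assumes that, for a fixed molecule, cost grows roughly with the fourth power of the number of basis functions (the formal count of two-electron integrals). That exponent is a simplifying assumption: real programs use screening and other approximations, and actual timings differ. The water function counts (24, 58, 115) assume spherical-harmonic functions.

```python
def relative_cost(n_small: int, n_large: int, power: float = 4.0) -> float:
    """Estimated cost ratio when a basis grows from n_small to n_large functions.

    The default power of 4 is a rule-of-thumb assumption, not a measurement.
    """
    if n_small <= 0 or n_large <= 0:
        raise ValueError("function counts must be positive")
    return (n_large / n_small) ** power


# Basis functions for one water molecule (spherical-harmonic functions).
WATER_SIZES = {"cc-pVDZ": 24, "cc-pVTZ": 58, "cc-pVQZ": 115}

reference = WATER_SIZES["cc-pVDZ"]
for name, size in WATER_SIZES.items():
    ratio = relative_cost(reference, size)
    print(f"{name:8s} {size:4d} functions, ~{ratio:,.0f}x the cc-pVDZ cost")
```

Under this assumption, each step up the zeta ladder costs far more than the growth in function count alone would suggest, which is why larger sets are reserved for cases that need them.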

The difference between **aug-cc-pVDZ** and **cc-pVTZ** lies in two key aspects: **augmentation** (the "aug-" prefix) and **zeta quality** (double-zeta vs. triple-zeta). Let’s break down what these terms mean:

### 1. **Augmentation ("aug-" prefix)**
- **aug-cc-pVDZ**: The "aug-" prefix stands for **augmented**. Augmented basis sets include **diffuse functions**, which are additional Gaussian functions with small exponents designed to capture the behavior of electrons far from the nucleus. Diffuse functions are crucial when studying:
  - **Anions**: Systems with loosely bound electrons.
  - **Excited states**: Electrons in higher energy states, farther from the nucleus.
  - **Weak interactions**: Systems involving van der Waals forces or other long-range interactions.

- **cc-pVTZ**: This set is **not augmented**, meaning it does not have these additional diffuse functions. It is still a high-accuracy basis set, but without the specific capabilities needed to describe weakly bound or highly spread-out electron clouds.
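
The link between a small exponent and a spread-out function can be made quantitative. For a normalized s-type Gaussian exp(-alpha * r^2), the root-mean-square radius of its density is sqrt(3 / (4 * alpha)). The sketch below applies that formula to three exponents that are hypothetical, chosen only to contrast orders of magnitude, and not taken from any published basis set.

```python
import math


def rms_radius(alpha: float) -> float:
    """RMS radius (in bohr) of the density of a normalized s-type Gaussian.

    For exp(-alpha * r^2), <r^2> = 3 / (4 * alpha).
    """
    if alpha <= 0:
        raise ValueError("exponent must be positive")
    return math.sqrt(3.0 / (4.0 * alpha))


# Hypothetical exponents for illustration only.
for label, alpha in [("tight (core-like)", 50.0),
                     ("valence-like", 0.5),
                     ("diffuse", 0.05)]:
    print(f"{label:18s} alpha = {alpha:6.2f}   rms radius = {rms_radius(alpha):.2f} bohr")
```

Because the radius scales as the inverse square root of the exponent, lowering the exponent by a factor of 10 stretches the function by a factor of about 3.16.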

### 2. **Zeta Quality (Double-Zeta vs. Triple-Zeta)**
- **cc-pVDZ** (the parent of aug-cc-pVDZ) is a **double-zeta (DZ)** basis set. In the context of basis sets, "zeta" counts how many basis functions describe each valence atomic orbital; the "V" in the name stands for valence, and core orbitals are still described by a single function. A double-zeta set therefore uses two functions per valence orbital (a tighter one for the inner part and a looser one for the outer part), which gives a more flexible description of electron behavior than a minimal (single-zeta) set.

- **cc-pVTZ** is a **triple-zeta (TZ)** basis set. It uses three functions per valence orbital and also adds further polarization functions of higher angular momentum, providing even greater flexibility and accuracy in the description of the electron density. Triple-zeta sets are particularly valuable when high accuracy is needed for bond dissociation energies, reaction energies, or electron correlation calculations.
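
How quickly the zeta level inflates a basis can be seen by listing the contracted shells on a single atom. The sketch below assumes the usual composition pattern of cc-pVXZ for a B-Ne atom, namely X+1 s shells, X p shells, X-1 d shells, and so on, and counts spherical-harmonic functions (2l+1 per shell). The helper names are illustrative.

```python
SHELL_LETTERS = "spdfg"


def cc_pvxz_shells(x: int) -> list:
    """Shell counts by angular momentum for cc-pVXZ on a B-Ne atom.

    Assumed pattern: X + 1 - l shells for l = 0..X (X = 2 gives 3s2p1d).
    """
    if not 2 <= x <= 4:
        raise ValueError("this sketch covers X = 2, 3, 4 (DZ, TZ, QZ) only")
    return [x + 1 - l for l in range(x + 1)]


def n_functions(shells: list) -> int:
    """Spherical-harmonic functions: each shell of angular momentum l has 2l + 1."""
    return sum(count * (2 * l + 1) for l, count in enumerate(shells))


def shell_label(shells: list) -> str:
    """Compact label such as '3s2p1d'."""
    return "".join(f"{count}{SHELL_LETTERS[l]}" for l, count in enumerate(shells))


for x, zeta in [(2, "cc-pVDZ"), (3, "cc-pVTZ"), (4, "cc-pVQZ")]:
    shells = cc_pvxz_shells(x)
    print(f"{zeta}: {shell_label(shells)} = {n_functions(shells)} functions")
```

For X = 2, 3, and 4 this gives 14, 30, and 55 functions per atom, so each zeta step roughly doubles the basis on a heavy atom.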

### **Summary of Key Differences**:

| **Basis Set**   | **Zeta Quality**      | **Diffuse Functions**      | **Typical Applications**                                |
|-----------------|-----------------------|----------------------------|---------------------------------------------------------|
| **aug-cc-pVDZ** | Double-Zeta (DZ)       | Yes (augmented)             | Anions, weak interactions (e.g., van der Waals), excited states |
| **cc-pVTZ**     | Triple-Zeta (TZ)       | No (not augmented)          | High-accuracy ground state calculations, correlation effects |
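
The two sets in this table can also be compared by size. The sketch below counts spherical-harmonic basis functions for one water molecule, assuming the usual shell patterns: X+1-l shells on a B-Ne atom, X-l shells on hydrogen, and one extra diffuse shell per angular momentum for the "aug-" variants. Function names are illustrative.

```python
SHELL_LETTERS = "spdfg"


def cc_shells(x: int, heavy: bool, augmented: bool = False) -> list:
    """Shell counts by angular momentum for (aug-)cc-pVXZ.

    Assumed pattern: B-Ne atoms carry X + 1 - l shells for l = 0..X,
    hydrogen carries X - l shells for l = 0..X-1, and 'aug-' adds one
    diffuse shell to every angular momentum already present.
    """
    top = x + 1 if heavy else x
    shells = [top - l for l in range(top)]
    if augmented:
        shells = [count + 1 for count in shells]
    return shells


def n_functions(shells: list) -> int:
    """Each shell of angular momentum l contributes 2l + 1 functions."""
    return sum(count * (2 * l + 1) for l, count in enumerate(shells))


def shell_label(shells: list) -> str:
    return "".join(f"{count}{SHELL_LETTERS[l]}" for l, count in enumerate(shells))


def water_size(x: int, augmented: bool = False) -> int:
    """Total functions for H2O: one oxygen plus two hydrogens."""
    oxygen = n_functions(cc_shells(x, heavy=True, augmented=augmented))
    hydrogen = n_functions(cc_shells(x, heavy=False, augmented=augmented))
    return oxygen + 2 * hydrogen


print("aug-cc-pVDZ  O:", shell_label(cc_shells(2, True, True)),
      " H2O total:", water_size(2, augmented=True))
print("cc-pVTZ      O:", shell_label(cc_shells(3, True)),
      " H2O total:", water_size(3))
```

For water this gives 41 functions with aug-cc-pVDZ and 58 with cc-pVTZ. On oxygen the two sets differ only by the f shell in their shell counts (4s3p2d versus 4s3p2d1f), but the extra aug-cc-pVDZ shells are diffuse, whereas the cc-pVTZ shells refine the valence region.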

### **Application Differences:**
1. **aug-cc-pVDZ** is better suited for systems where diffuse electron clouds play a significant role, such as:
   - Anions (e.g., negative ions)
   - Rydberg states (highly excited electronic states)
   - Weakly bound systems (e.g., van der Waals complexes)

2. **cc-pVTZ**, being a triple-zeta basis set, offers more accuracy for ground-state properties and electron correlation but is not ideal for capturing long-range, diffuse electron behavior. It is commonly used in high-accuracy studies, especially in conjunction with correlated methods like MP2 or CCSD(T), but it's less suited for diffuse systems.

In summary, **aug-cc-pVDZ** is more suitable for calculations involving weakly bound or highly spread-out electrons, while **cc-pVTZ** is better for high-accuracy calculations where electron correlation in the ground state is crucial but without the need for diffuse functions.