# 6.1 What is in our input file? 👀
<br>

If you open the program's repository folder, you should see that a new folder named after the molecule was created. For the example that we ran, the folder is called "water". It contains the following folders:

- **COSMO_TZVPD** folder

- **energy_TZVPD** folder


## 6.1.2 What is TZVPD ❓


TZVPD stands for **Triple-Zeta Valence with Polarization and Diffuse functions**. It is a type of **basis set** used in quantum chemistry calculations; here it is the basis set used by the ORCA software. A basis set is the collection of functions used to build the molecular orbitals of a system, and therefore to describe the electronic wave function of a molecule.

<div style="line-height: 1.5;">

- **<span style="color:#BA55D3;">Triple-Zeta (TZ):</span>** it means that **three** functions are used to describe each valence orbital, providing a more accurate representation compared to single-zeta or double-zeta basis sets. The valence orbitals are the outermost orbitals of an atom that are involved in chemical bonding. The different types of valence orbitals are s, p, d and sometimes f. 

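- **<span style="color:#4682B4;">Valence:</span>** it means that the triple-zeta treatment is applied to the valence orbitals, which hold the valence electrons. The core orbitals are described with fewer functions.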

- **<span style="color:#3CB371;">Polarization (P):</span>** adds extra features to account for how the electron cloud is distorted by the molecular environment, improving the accuracy of the calculations.
    - Think about a water molecule (H₂O). The oxygen atom has s and p orbitals that help it bond with hydrogen. When it forms bonds with hydrogen, the electron cloud around oxygen can become distorted. Adding polarization functions (e.g. d-type functions for oxygen) can help us model this distortion better.

- **<span style="color:#DAA520;">Diffuse (D):</span>** includes extra functions that are more spread out in space. These are important for accurately describing anions and systems with weakly bound electrons.
    - Think about a chloride ion (Cl⁻). The extra electron in Cl⁻ is not as tightly bound as the electrons in a neutral chlorine atom. To describe this loosely bound electron correctly, we add special functions (e.g., additional s and p functions with small exponents, which decay slowly with distance from the nucleus) to the basis set. This allows the basis set to capture the electron's extended distribution.

You can find more about basis sets in the [ORCA Input Library](https://sites.google.com/site/orcainputlibrary/basis-sets).

</div>
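To make the difference between tight and diffuse functions more concrete, here is a minimal sketch. It relies on a standard result: for an s-type Gaussian `exp(-alpha * r**2)`, the root-mean-square radius of its density is `sqrt(3 / (4 * alpha))`. The exponents below are hypothetical values chosen only for illustration; they are not taken from the TZVPD basis set.

```python
import math


def rms_radius(alpha: float) -> float:
    """RMS radius (in bohr) of the density of an s-type Gaussian exp(-alpha * r**2)."""
    return math.sqrt(3.0 / (4.0 * alpha))


# Hypothetical exponents (in bohr^-2), for illustration only
exponents = {"tight (core-like)": 100.0, "valence": 1.0, "diffuse": 0.05}

for label, alpha in exponents.items():
    print(f"{label:>18}: alpha = {alpha:7.2f} -> rms radius = {rms_radius(alpha):.2f} bohr")
```

The smaller the exponent, the larger the radius, which is why diffuse functions are the ones that reach far from the nucleus.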


## 6.1.3 COSMO_TZVPD

<div style="line-height: 1.5;">

Opening the folder we see two files: **water_c000** and **log_output**. The latter contains all details of the ORCA program and the steps taken to perform the necessary calculations to produce the *.orcacosmo* file.

Let's analyze what the *water_c000* file contains:

- **<span style="color:#DB7093;">ENERGY:</span>** <br>
This is the energy obtained from the CPCM calculation, i.e. with the molecule embedded in an ideal conductor, given in atomic units (a.u.).

- **<span style="color:#3CB371;">DIPOLE MOMENT:</span>** <br>
The values represent the x, y, and z components of the dipole moment vector. The dipole moment is defined as a vector quantity that quantifies the separation between positive and negative charges within a molecule.<br>

- **<span style="color:#20B2AA;">XYZ_FILE:</span>**<br>
Coordinates from the ORCA job. Here we can see the number of atoms (for water = 3) and the Cartesian coordinates (x, y, z) in Angstroms.<br>

- **<span style="color:#CD853F;">COSMO:</span>**<br>
General information about the molecule (number of atoms and surface points), the surface and epsilon function types (how the molecular surface is constructed for the COSMO calculation), and the dielectric constant of the solvent (a measure of its ability to reduce the electrostatic interactions between charged particles). It also lists the dielectric energy (the energy associated with moving our molecule from the gas phase into the CPCM phase).<br>

- **<span style="color:#9370DB;">Cartesian Coordinates, RADII and Atomic Number:</span>** <br>
Cartesian coordinates in atomic units (a.u.), atomic radii, and atomic numbers for each atom in the molecule.<br>

- **<span style="color:#FF6347;">Surface points:</span>** <br>
This section has the following columns:
    - the 3D coordinates of the center of each segment
    - the area of each segment
    - the electrostatic potential on each segment
    - w_leb, Switch_F and G_width, which are part of the ORCA algorithm that generates the positions of the segments and their properties
    - the index of the atom each segment belongs to (for water, three atoms ∴ 0, 1, 2)
    <br><br>

- **<span style="color:#BC8F8F;">COSMO_corrected:</span>**  <br>
Corrected values for the dielectric energy and the CPCM charges after redistributing the outlying charge over the cavity. The cavity is like a bubble surrounding the whole molecule. Part of the electron density can lie outside this cavity, especially for bigger atoms that carry a higher charge; this part is known as the outlying charge. To better represent the molecule's electron distribution, ORCA applies a correction that distributes the outlying charge over the whole cavity. <br>

![before](4.png) ![after](5.png)



- **<span style="color:#8B4513;">Adjacency matrix:</span>**  <br>
The connections between atoms of the molecule. Node 0 represents the oxygen atom and nodes 1 and 2 the two hydrogen atoms.

</div>
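As a small illustration of the adjacency matrix described above, the sketch below assumes it is a symmetric matrix of 0s and 1s in which entry `[i][j]` is 1 when atoms `i` and `j` are bonded. The matrix is written by hand for water rather than read from the file, and the helper `bonds_from_adjacency` is a hypothetical name used only here.

```python
# Assumed adjacency matrix for water: node 0 = O, nodes 1 and 2 = H
atoms = ["O", "H", "H"]
adjacency = [
    [0, 1, 1],  # O is bonded to both hydrogens
    [1, 0, 0],  # H1 is bonded only to O
    [1, 0, 0],  # H2 is bonded only to O
]


def bonds_from_adjacency(matrix):
    """Return every bond once as a pair (i, j) with i < j."""
    n = len(matrix)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if matrix[i][j]]


bonds = bonds_from_adjacency(adjacency)
for i, j in bonds:
    print(f"{atoms[i]}{i} - {atoms[j]}{j}")
```

Only the upper triangle is scanned, so each bond is listed once even though the matrix stores it twice.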

In [1]:
# ⚠️⚠️⚠️ Replace this path with the location of the file on your own machine
water_c000 = r"C:\Users\prisc\Desktop\orcacosmo_files\water\COSMO_TZVPD\water_c000.orcacosmo"

# Open and read the .orcacosmo file
with open(water_c000, 'r') as file:
    contents = file.read()

# Print the contents
print(contents)

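A hard-coded path like the one above only works on the machine it was written for. A minimal sketch of a more forgiving version, assuming the results live in a folder called `orcacosmo_files` in your home directory (a hypothetical location; adjust `base_dir` to your own setup), could look like this. The helper name `read_text_if_exists` is also just an illustration.

```python
from pathlib import Path


def read_text_if_exists(path):
    """Return the file's text, or None (with a hint) if the file is missing."""
    path = Path(path)
    if not path.is_file():
        print(f"File not found: {path}. Check that the path matches your own machine.")
        return None
    return path.read_text(encoding="utf-8")


# Hypothetical location: change base_dir to wherever your results folder lives
base_dir = Path.home() / "orcacosmo_files"
cosmo_file = base_dir / "water" / "COSMO_TZVPD" / "water_c000.orcacosmo"

contents = read_text_if_exists(cosmo_file)
if contents is not None:
    print(contents)
```

Building the path with `pathlib` also avoids having to worry about backslashes versus forward slashes on different operating systems.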

## 6.1.4 energy_TZVPD  📂

The folder contains an *xyz* file with the same name, water_c000, which has the following parts:

- **<span style="color:#9370DB;">Number of Atoms</span>**: The first line, 3, is the number of atoms in our molecule

- **<span style="color:#FF6347;">Energy</span>**: The energy of the molecule in the gas phase, in atomic units (Hartree)

- **<span style="color:#3CB371;">Atomic Coordinates</span>**: atomic symbols and their Cartesian coordinates in space. Each line corresponds to an atom.

In [2]:
# ⚠️⚠️⚠️ Replace this path with the location of the file on your own machine
energy = r"C:\Users\prisc\Desktop\orcacosmo_files\water\energy_TZVPD\water_c000.xyz"
# Open and read the .xyz file
with open(energy, 'r') as file:
    contents = file.read()

# Print the contents
print(contents)

3
energy: -76.4684885136930035
O  -0.0007809143160200  0.3969563993601200  0.0000000000000000
H  -0.7671274283801700  -0.1999882198486500  0.0000000000000000
H  0.7679083426961900  -0.1969681795114700  0.0000000000000000
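Once the file is read, its three parts can be pulled apart with a few lines of Python. The sketch below assumes the layout shown in the output above (atom count, then an `energy: <value>` line, then one line per atom) and that the coordinates are in Angstroms, as is conventional for xyz files. The function names are hypothetical. As a quick sanity check it also computes the two O-H bond lengths and the H-O-H angle.

```python
import math

# The file contents printed above, pasted in so the example is self-contained
xyz_text = """3
energy: -76.4684885136930035
O  -0.0007809143160200  0.3969563993601200  0.0000000000000000
H  -0.7671274283801700  -0.1999882198486500  0.0000000000000000
H  0.7679083426961900  -0.1969681795114700  0.0000000000000000
"""


def parse_xyz(text):
    """Split the xyz text into atom count, energy, and (symbol, (x, y, z)) tuples."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    energy = float(lines[1].split(":", 1)[1])
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()
        atoms.append((symbol, (float(x), float(y), float(z))))
    return n_atoms, energy, atoms


def angle_deg(center, a, b):
    """Angle a-center-b in degrees."""
    va = [p - c for p, c in zip(a, center)]
    vb = [p - c for p, c in zip(b, center)]
    dot = sum(p * q for p, q in zip(va, vb))
    return math.degrees(math.acos(dot / (math.hypot(*va) * math.hypot(*vb))))


n_atoms, energy, atoms = parse_xyz(xyz_text)
(_, o), (_, h1), (_, h2) = atoms

d_oh1 = math.dist(o, h1)
d_oh2 = math.dist(o, h2)
hoh = angle_deg(o, h1, h2)

print(f"atoms: {n_atoms}, energy: {energy:.6f}")
print(f"O-H distances: {d_oh1:.3f} and {d_oh2:.3f}")
print(f"H-O-H angle: {hoh:.1f} degrees")
```

For this geometry, both O-H distances come out at about 0.97 and the angle at roughly 104 degrees, which is what one would expect for a water molecule.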

