<h2>Eye conditions and diseases among people</h2>
<br></br>
<h3>Introduction</h3>
<p>Eye-related conditions and diseases encompass a wide range of disorders that affect vision and overall eye health. These conditions, from common issues like refractive errors and cataracts to complex diseases such as glaucoma and age-related macular degeneration, can significantly impact quality of life. Early detection, diagnosis, and advancements in research are crucial for effective treatment and prevention, highlighting the importance of continuous study in the field of ophthalmology.</p>
<br></br>
<h3>Data set</h3>

In [8]:
from pandas import read_csv
import numpy as np

# Row numbers of repeat visits: according to the dataset authors, these records
# belong to subjects who were examined more than once
skip = [48,89,95,108,111,126,146,149,153,156,160,163,172,175,187,220,230,251,255,258,263,267,269,270,289,291,299,318,323,328,331,335]

# Load only the columns needed for the analysis
data = read_csv('participants_info.csv', usecols=['age_years', 'sex', 'diagnosis1', 'diagnosis2', 'diagnosis3'], skiprows=skip)
# Display all columns (repeat visits removed) for reference
read_csv('participants_info.csv', skiprows=skip)

Unnamed: 0,id_record,date,age_years,sex,diagnosis1,diagnosis2,diagnosis3,va_re_logMar,va_le_logMar,unilateral,rep_record,comments
0,1,2016-09-15,13,Male,Normal,,,-0.08,0.06,,,
1,2,2005-09-15,13,Female,Congenital stationary night blindness,,,0.18,0.16,,,
2,3,2019-08-08,49,Female,Orbital ischemia,Systemic disorder with ocular manifestations,,0.26,0.00,,Id:0329 - Id:0154 - Id:0049 - Id:0271,
3,4,2004-12-16,43,Female,Retinitis pigmentosa,,,,,,,
4,5,2016-07-13,47,Female,Normal,,,0.10,0.10,,,
...,...,...,...,...,...,...,...,...,...,...,...,...
299,330,2003-07-16,43,Female,Normal,,,,,,,Alteration in visual acuity
300,332,2016-11-17,43,Male,Autoimmune retinopathy,Inflammatory disease,,0.02,0.00,,Id:0221 - Id:0252 - Id:0126,
301,333,2015-01-14,4,Male,X linked retinoschisis,,,1.00,0.52,,,
302,334,2021-08-28,14,Male,Normal,,,-0.10,-0.06,,,Family history of chorioretinopathy


<h2>Comments on data before describing it</h2>
<p>The dataset records a wide range of observations about the participants' eyesight, so it supports many possible analyses.</p>
<br></br>
<h2>Data Selection</h2>
<p>The dataset comprises information collected from 304 subjects enrolled at IOBA, a University of Valladolid-affiliated institution in Spain. Data collection ran from 2003 to 2022. During this period, 23 individuals had multiple visits: 19 had two visits each, one had three, one had four, and two had five each. As part of the routine clinical evaluation, all subjects were diagnosed by ophthalmology specialists. In this analysis, repeat visits are excluded.</p>
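<p>The visit counts above can be cross-checked against the skip list used when loading the file. The following is a minimal sketch, assuming (as the comment in the loading cell suggests) that every skipped row is one repeat visit. The names visits_per_subject, subjects_with_repeats and repeat_records are illustrative and not part of the dataset.</p>

```python
# Number of subjects for each visit count, as stated in the text above
visits_per_subject = {2: 19, 3: 1, 4: 1, 5: 2}

# Subjects who were examined more than once
subjects_with_repeats = sum(visits_per_subject.values())

# Every subject contributes (visits - 1) records beyond the first visit
repeat_records = sum((visits - 1) * n for visits, n in visits_per_subject.items())

# The skip list copied from the loading cell
skip = [48, 89, 95, 108, 111, 126, 146, 149, 153, 156, 160, 163, 172, 175, 187,
        220, 230, 251, 255, 258, 263, 267, 269, 270, 289, 291, 299, 318, 323,
        328, 331, 335]

print(subjects_with_repeats)       # 23
print(repeat_records, len(skip))   # 32 32
```

<p>The number of skipped rows equals the number of repeat visits implied by the text. This only checks the counts; it does not verify which rows are skipped.</p>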
<h2>Describing data</h2>
<h4>Population</h4>
<p>The participants are people from Valladolid, Spain, who took part in the research between 2003 and 2022.</p>
<h4>Size</h4>
<p>304 people</p>
<h4>Data description</h4>
<p>The dataset contains many fields, from sex and age to records of repeat visits. This analysis focuses on three of them: the participant's age, their sex, and whether their eyesight is normal or they have one or more diagnosed diseases.</p>
<h4>Variables:</h4>
<h5>Variable AGE</h5>
<p>Variable name: AGE</p>
<p>Variable description: the variable is the age of a participant</p>
<p>Variable type: continuous (recorded in whole years)</p>
<p>Range: [4,86]</p>
<h5>Variable SEX</h5>
<p>Variable name: SEX</p>
<p>Variable description: the variable is the sex of a participant</p>
<p>Variable type: categorical</p>
<p>Range: [Male,Female]</p>
<h5>Variable DISEASES</h5>
<p>Variable name: DISEASES</p>
<p>Variable description: the variable is the number of diseases a participant has</p>
<p>Variable type: discrete</p>
<p>Range: [0,3]</p>
<br></br>
<h2>Data for variable DISEASES</h2>
<h4>Where does the variable come from?</h4>
<p>The variable is the number of eye diseases diagnosed for each subject. A diagnosis1 value of 'Normal' counts as 0 diseases. Otherwise, the count is 1, 2 or 3, depending on whether diagnosis1 only, diagnosis1 and diagnosis2, or all three of diagnosis1, diagnosis2 and diagnosis3 are filled in.</p>
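<p>The counting rule can be illustrated on a few records. The following is a minimal pure-Python sketch, not the notebook's own implementation: the helper count_diseases is hypothetical, and missing diagnoses are assumed to be represented by None or NaN. The three example records are the first three rows of the table shown earlier.</p>

```python
import math


def count_diseases(diagnoses):
    """Count the diagnoses that are neither missing nor 'Normal'."""
    def is_missing(value):
        # Treat None and float NaN as "no diagnosis recorded"
        return value is None or (isinstance(value, float) and math.isnan(value))

    return sum(1 for d in diagnoses if not is_missing(d) and d != 'Normal')


# (diagnosis1, diagnosis2, diagnosis3) for the first three records
examples = [
    ('Normal', None, None),
    ('Congenital stationary night blindness', None, None),
    ('Orbital ischemia', 'Systemic disorder with ocular manifestations', None),
]

counts = [count_diseases(row) for row in examples]
print(counts)  # [0, 1, 2]
```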
<h4>Displaying data</h4>

In [9]:
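# Work on a copy of the three diagnosis columns; missing diagnoses become 0
data_diseases = data[['diagnosis1','diagnosis2','diagnosis3']].copy()
data_diseases.fillna(0, inplace=True)
x = np.array(data_diseases)
x[x == 'Normal'] = 0  # 'Normal' means no disease
x[x != 0] = 1         # any other diagnosis counts as one disease
# Number of diseases per participant, stored as integers
y = np.sum(x, axis=1).astype(int)
data['diseases'] = y
# Keep only the three variables described above and save them
data = data[['age_years','sex','diseases']]
data.to_csv('newData.csv', index=False)
data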

Unnamed: 0,age_years,sex,diseases
0,13,Male,0
1,13,Female,1
2,49,Female,2
3,43,Female,1
4,47,Female,0
...,...,...,...
299,43,Female,0
300,43,Male,2
301,4,Male,1
302,14,Male,0


<h2>Why is the data relevant?</h2>
<p>The dataset has many observations (more than 300) and many columns, so it offers several statistical series from which the information of interest can be extracted clearly. The records were produced by IOBA, a University of Valladolid-affiliated institution in Spain, so the source can be trusted. Together, these properties make the dataset suitable for analysis.</p>
<br></br>
<h2>Flat view of the first ten rows (displayed below for convenience)</h2>

In [10]:
data[:10]

Unnamed: 0,age_years,sex,diseases
0,13,Male,0
1,13,Female,1
2,49,Female,2
3,43,Female,1
4,47,Female,0
5,20,Male,1
6,43,Male,0
7,10,Female,3
8,13,Male,1
9,35,Female,1


<p>As the data shows, the participants differ in age and sex. Some have normal eyesight, while others have up to 3 diseases. The data has not been analysed yet, so no conclusions can be drawn at this stage, but the observations contain enough information for the analysis.</p>
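<p>As a first step before any analysis, the ten rows displayed above can be tabulated. The following is a minimal sketch using only those ten rows, so the counts describe this excerpt and not the whole dataset. The names rows, disease_counts and sex_counts are illustrative.</p>

```python
from collections import Counter

# (age_years, sex, diseases) for the ten rows displayed above
rows = [
    (13, 'Male', 0), (13, 'Female', 1), (49, 'Female', 2), (43, 'Female', 1),
    (47, 'Female', 0), (20, 'Male', 1), (43, 'Male', 0), (10, 'Female', 3),
    (13, 'Male', 1), (35, 'Female', 1),
]

# Frequency of each disease count and of each sex in the excerpt
disease_counts = Counter(diseases for _, _, diseases in rows)
sex_counts = Counter(sex for _, sex, _ in rows)

print(sorted(disease_counts.items()))  # [(0, 3), (1, 5), (2, 1), (3, 1)]
print(sorted(sex_counts.items()))      # [('Female', 6), ('Male', 4)]
```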
<br></br>
<h2>Source of data</h2>
<h4>Title of the publication</h4>
<p>A Comprehensive Dataset of Pattern Electroretinograms for Ocular Electrophysiology Research: The PERG-IOBA Dataset</p>
<h4>Link</h4>
<p>https://doi.org/10.13026/d24m-w054</p>
<h4>Authors and theme</h4>
<p>Fernández, I., Cuadrado Asensio, R., Larriba, Y., Rueda, C., & Coco-Martin, R. M. (2024). A Comprehensive Dataset of Pattern Electroretinograms for Ocular Electrophysiology Research: The PERG-IOBA Dataset (version 1.0.0). PhysioNet.</p>
<h4>Standard citation for PhysioNet</h4>
<p>Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.</p>
<h4>Publication</h4>
<p>Published: Jan. 19, 2024. Version: 1.0.0</p>
