How to produce new samples #112

Closed
Kafkaica opened this issue Dec 3, 2019 · 3 comments · Fixed by #119
Labels: bug (There is an error in the code that needs to be fixed)

Comments

@Kafkaica

Kafkaica commented Dec 3, 2019

Hi,

I am trying to find the best bivariate copula for my data and use it to produce new samples. When I choose the Clayton model, the sampled data looks reasonable. However, when I choose Frank or Gumbel, the sampled data looks like the figure below. Could someone help me with that?

import numpy as np
import matplotlib.pyplot as plt
from copulas.bivariate.base import Bivariate, CopulaTypes


# Prepare the data: each file holds one value per line.
# np.loadtxt also tolerates a trailing blank line, which float('') would not.
x = np.loadtxt('Maximum Yearly Discharge.txt')
y = np.loadtxt('Prediction.txt')

# Pair the two series into an (n, 2) array, one column per variable.
z = np.column_stack((x, y))

# Fit the copula. Swap in CopulaTypes.CLAYTON or CopulaTypes.GUMBEL to compare.
copula = Bivariate(CopulaTypes.FRANK)
copula.fit(z)

# Produce samples.
samples = copula.sample(1000)

# Min-max scale the empirical data so it shares the [0, 1] axes of the samples.
normalized_x = (x - x.min()) / (x.max() - x.min())
normalized_y = (y - y.min()) / (y.max() - y.min())

plt.scatter(samples[:, 0], samples[:, 1], color='0.75', label='Simulated Data')
plt.scatter(normalized_x, normalized_y, color='blue', label='Empirical Data')
plt.xlabel('Maximum Yearly Discharge (Scaled)')
plt.ylabel('Associated Tidal height (Scaled)')
plt.legend(loc='lower right')
plt.savefig('Simulated Data.jpg')
plt.show()

Maximum Yearly Discharge.txt
Prediction.txt

@Kafkaica
Author

Kafkaica commented Dec 3, 2019

Using Clayton for sampling:
[Figure: Simulated Data]

Using Gumbel or Frank for sampling:
[Figure: Simulated Data1]
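
A numeric check can complement the scatter plots when judging whether sampled data preserves the dependence in the original data. The sketch below is one possible approach, not part of the copulas library: `pseudo_observations` and `kendall_tau` are hypothetical helper names introduced here for illustration. It assumes the copula samples have roughly uniform marginals on [0, 1], in which case rank-based pseudo-observations are a more comparable scaling of the empirical data than min-max scaling.

```python
import numpy as np


def pseudo_observations(values):
    """Map a 1-D sample into (0, 1) via rank / (n + 1); ties are broken by order."""
    values = np.asarray(values, dtype=float)
    ranks = np.argsort(np.argsort(values)) + 1  # ranks 1..n
    return ranks / (len(values) + 1)


def kendall_tau(a, b):
    """Plain O(n^2) Kendall's tau (tau-a); fine for a few thousand points."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    # Sign of every pairwise difference; concordant pairs multiply to +1,
    # discordant pairs to -1, and each unordered pair is counted twice.
    sign_a = np.sign(a[:, None] - a[None, :])
    sign_b = np.sign(b[:, None] - b[None, :])
    return (sign_a * sign_b).sum() / (n * (n - 1))
```

With the variables from the script above, `kendall_tau(x, y)` and `kendall_tau(samples[:, 0], samples[:, 1])` would be expected to be reasonably close if the sampler respects the fitted dependence, and `pseudo_observations(x)` against `pseudo_observations(y)` can be plotted in place of the min-max scaled values.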

@csala
Contributor

csala commented Dec 3, 2019

Hi @Kafkaica, thanks for bringing this up, and for the detailed example!
We will have a look at it as soon as we can and get back to you.

@JDTheRipperPC
Collaborator

Hi @Kafkaica, upon review we found a bug in the way samples were produced inside the Frank and Gumbel classes.
We have just fixed it, and a new release including this fix will be published soon.

Thanks for the heads up!
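
For readers who want an independent reference to compare library output against, here is a minimal sketch of Frank copula sampling by conditional inversion. It is not the library's code and not the fix referenced in this issue; `sample_frank` is a hypothetical name, and the formula is the standard textbook inversion of the conditional distribution of v given u, valid for any nonzero `theta`.

```python
import numpy as np


def sample_frank(theta, n, seed=None):
    """Draw n pairs (u, v) from a Frank copula with parameter theta != 0."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)  # first coordinate
    p = rng.uniform(size=n)  # target value of the conditional CDF of v given u
    g = np.expm1(-theta)     # e^{-theta} - 1
    a = np.exp(-theta * u)   # e^{-theta * u}
    # Solve dC/du (u, v) = p for b = e^{-theta * v} - 1, then recover v.
    b = p * g / (a - p * (a - 1.0))
    v = -np.log1p(b) / theta
    return np.column_stack((u, v))
```

A healthy sampler should fill the unit square with points concentrated along the diagonal for positive `theta` and along the anti-diagonal for negative `theta`, with both marginals approximately uniform.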

@JDTheRipperPC removed their assignment Dec 23, 2019
@csala added the bug label (There is an error in the code that needs to be fixed) Dec 23, 2019