model: scratch: Create Linear Regression model #59

Closed
pdxjohnny opened this issue May 8, 2019 · 5 comments
Labels
enhancement New feature or request

Comments

@pdxjohnny
Member

Assignee: @yashlamba

./scripts/create.sh model scratch

Then fill out model with your implementation of linear regression. (Maybe linear.py)

@pdxjohnny pdxjohnny added this to To do in Beta Release - 0.5.0 via automation May 8, 2019
@pdxjohnny pdxjohnny added the enhancement New feature or request label May 8, 2019
@yashlamba
Contributor

async def features(self, features: Features):
    '''
    Converts repos into training data
    '''
    cols: Dict[str, Any] = {}
    for feature in features:
        col = self.feature_feature_column(feature)
        if col is not None:
            cols[feature.NAME] = col
    return cols

def feature_feature_column(self, feature: Feature):
    '''
    Creates a feature column for a feature
    '''
    dtype = feature.dtype()
    if not inspect.isclass(dtype):
        LOGGER.warning('Unknown dtype %r. Could not create column' % (dtype))
        return None
    if dtype is int or issubclass(dtype, int) \
            or dtype is float or issubclass(dtype, float):
        return tensorflow.feature_column.numeric_column(feature.NAME,
                                                        shape=feature.length())
    LOGGER.warning('Unknown dtype %r. Could not create column' % (dtype))
    return None

def model_dir_path(self, features: Features):
    '''
    Creates the path to the model dir by using the provided model dir and
    the sha384 hash of the concatenated feature names.
    '''
    if self.parent.config.directory is None:
        return None
    model = hashlib.sha384(''.join(features.names()).encode('utf-8'))\
        .hexdigest()
    if not os.path.isdir(self.parent.config.directory):
        raise NotADirectoryError('%s is not a directory'
                                 % (self.parent.config.directory))
    return os.path.join(self.parent.config.directory, model)
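To make the model_dir_path naming scheme concrete, here is a standalone sketch (not DFFML code; the function name and base path are illustrative): each unique combination of feature names hashes to its own model directory.

```python
import hashlib
import os


def model_dir_for(base_directory, feature_names):
    # Concatenate the feature names and take the sha384 hex digest, so
    # every distinct set of features maps to a distinct subdirectory
    digest = hashlib.sha384(
        ''.join(feature_names).encode('utf-8')
    ).hexdigest()
    return os.path.join(base_directory, digest)


path = model_dir_for('/tmp/models', ['commits', 'authors'])
```

The same feature names always produce the same path, which is what lets a model find its previously trained state on disk.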

@pdxjohnny I was implementing applicable_features and found it led to the following functions. Do they need to be re-implemented, or can I pick them up from dnn and change the conditions (for starters, the feature length should be 2, and so on)?

@yashlamba
Contributor

Also, can you suggest how I should go about debugging and testing my code (for now, just checking whether I have received the data successfully or not)?

@pdxjohnny
Member Author

Here's an outline for 2.a.iv and 2.a.v from https://docs.google.com/document/d/16u9Tev3O0CcUDe2nfikHmrO3Xnd4ASJ45myFgQLpvzM/edit#heading=h.s3lkoesyhz9v

Is this what you're talking about?

class SimpleLinearRegression(Model):
    async def applicable_features(self, features):
        if len(features) != 1:
            raise ValueError("simple LR only supports a single feature")
        if features[0].dtype() != int and features[0].dtype() != float:
            raise ValueError("simple LR only supports int or float feature")
        if features[0].length() != 1:
            raise ValueError("simple LR only supports single values (non-matrix / array)")
        features_we_care_about = [features[0].NAME]
        return features_we_care_about

    async def train(self, sources, features):
        features_we_care_about = await self.applicable_features(features)
        xData = []
        yData = []
        async for repo in sources.with_features(features_we_care_about):
            # Grab a subset of the feature data being stored within the repo
            # The subset is features_we_care_about plus the feature we want to predict
            feature_data = repo.features(features_we_care_about + [self.parent.config.predict])
            xData.append(feature_data[features_we_care_about[0]])
            yData.append(feature_data[self.parent.config.predict])
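For the fitting step the outline stops short of, here is a minimal from-scratch sketch of ordinary least squares on a single feature. The function names are illustrative, not DFFML API; the xData/yData names follow the outline's train() loop.

```python
def fit_simple_lr(xData, yData):
    # Ordinary least squares for y = slope * x + intercept
    n = len(xData)
    mean_x = sum(xData) / n
    mean_y = sum(yData) / n
    # slope = covariance(x, y) / variance(x)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xData, yData))
    den = sum((x - mean_x) ** 2 for x in xData)
    slope = num / den
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    return slope * x + intercept


# Data lying exactly on y = 2x + 1, so the fit recovers slope 2, intercept 1
slope, intercept = fit_simple_lr([0, 1, 2, 3], [1, 3, 5, 7])
```

train() would call something like fit_simple_lr on the accumulated xData/yData and persist slope and intercept; predict() then only needs those two numbers.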

@yashlamba
Contributor

Well, this is pretty simple and understandable. I was using dnn as my reference, wherein applicable_features led to self.features, which in turn led to feature_feature_column (which ultimately checked the types).

@pdxjohnny
Member Author

ya dnn is complicated by tensorflow APIs :)

@pdxjohnny pdxjohnny added this to the 0.5.0 Beta Release milestone Jun 27, 2019